This Week in Mojo 2023-06-09
Mojo Playground Update
Tuple syntax now works on the left-hand side of assignments (in “lvalue” positions), enabling things like (a, b) = (b, a).
There are several caveats: the element types must exactly match (no implicit conversions), this only work with values of TupleLiteral type (notably, it will not work with PythonObject yet) and parentheses are required for tuple syntax.
Mojo Playground no longer includes the following Python packages (due to size, compute costs, and environment complications): torch, tensorflow, keras, transformers.
The data types and scalar names now conform to the naming convention used by numpy. So we use Int32 instead of SI32, similarly using Float32 instead of F32. Closes Issue Issue #152.
- Issue #287 - computed lvalues don’t handle raising functions correctly
- Issue #318- Large integers are not being printed correctly
- Issue #326 - Float modulo operator is not working as expected
- Issue #282- Default arguments are not working as expected
- Issue #271- Confusing error message when converting between function types with different result semantics
- New Proposal: Provenance Tracking and Lifetimes in Mojo
- New Proposal: Keyword Naming
- Blog Post: If AI serving tech can’t solve today’s problems, how do we scale into the future?
- Blog Post: Do LLMs eliminate the need for programming languages?
Mojo Team Answers
Pureness is what is known as an "effect" in PL terminology. You can see this in the handling of async and raises in the current mojo implementation: a non-raising function is not allowed to call a raising function directly - it must wrap it in a try block.
I don't see a way to provide this sort of mapping from one world to the other for purity, I think we cannot practically implement this, and while pure computation is important, it is actually quite complicated: is reading from memory pure? If no, "purity" is pretty useless. If so, you cannot use purity information for much optimization, because you need to know which memory sets may be read and written by functions anyway.
Also, in other pure-functional languages like Haskell, you need escape hatches (perform unsafe io) because you want to add printf debugging etc to "pure" functions and compiler enforcement makes that whole thing incredibly difficult.
Overall I can understand wanting to have this conceptually, but I can't see how it could work out well in practice. We can come back to this later as the language evolves.
Mojo Champion mod on Discord
We reached out to individuals we identified ourselves this time. In the future as the server scales, if we look to add more, we will probably send out an application form that folks can fill out and we'll review on a rolling basis.
String to PythonObject
Right now you can turn a
StringRef or a
StringLiteral into a
PythonObject. To get a
PythonObject from a
String, you'd need to turn the
String into a
StringRef. This is available through some underscored methods, but it's currently unsafe due to some lifetime issues. Let me see if I can add a direct conversion path, though it will take a week to make its way to the playground.
A direct conversion should be included in the next Playground release.
Mojo already gives a couple warnings that suggest better things to do, such as using
let instead of
var where possible. That said, the compiler isn't good at pointing out larger design pattern changes, for this I think we'll have LLM based tools outside the compiler itself. The UI is much better for explaining things in that context.
Compile time metaprogramming relationship to MLIR
Mojo has great support for evaluating fairly arbitrary expressions at compile time with an interpreter that (under the covers) ends up calling an MLIR dialect's fold operations.
These then get wrapped up in structs to give a new programmable veneer etc. Check out the Bool workbook example in the documentation for a simple example of doing this with the index dialect.
Mojo is designed "for" MLIR in this way - MLIR can talk to roughly anything that computes, and it is very important (over time) for Mojo to scale into new forms of computation, whether it be low level things like low-level tensorcore operators, mid-level things like a shape dialect, or high level things like an ML operator graph.
Right now many folks on the channel are excited about a Python++, but Mojo was designed to work backwards from the "speed of light" of hardware and accelerators. The syntax and applicability to Python is important for community reasons, but not particularly relevant to the accelerator side of Mojo.
This is an evolving part of the language and likely another difference we pull into the
def world, in a
def we could default to getting objects for literals, but within a
fn you get typed literals. Another potential solution is to have aggressive decay rules in
True starts out being typed to
Bool but we allow decaying to object when an expression doesn't type check otherwise. We'll need to experiment with that when we make progress on other more basic things. The major reason to have both
fn is to have a Python compatible world and a stricter systems programmer world, and have them coexist seamlessly.
Struct Memory Layout C Compatibility
I agree that an opt-in decorator that specifies layout is the right way to go. By default the compiler should be able to reorder fields to eliminate internal padding so programmers don't have to worry about this, but people putting bits on a wire or dealing with c compatibility should be able to get that. We will need to properly design this out.
Ints and pointers are different things, so no ints don't carry provenance. This is one of the major things that C/C++ got wrong that has haunted LLVM IR and many other things for a long time. Taking a hard line on this makes everything simpler, but that is only possible when you have a fresh slate like Mojo provides us.
There are so many variants of Float8 representation. We need to think about which ones does Mojo represents and how to expose the variety. For now, we are removing Float8 from the DType list to avoid folks from falling into this trap.
Integer Overflow on
It needs to eventually provide full Python semantics, so we'll need
object to contain a
PythonObject in its variant. We could overflow from inline
int to Python object on demand.
Boolean on SIMD types
The way to do this is by explicitly calling the bool method later:
struct MyPair: var first: Float32 var second: Float32 fn __lt__(self, rhs: MyPair) -> Bool: return ( self.first < rhs.first or (self.first == rhs.first and self.second < rhs.second) ).__bool__()
We could add
SIMD[DType.bool, 1] as an initializer to the
Bool type, but cannot do that currently because
Bool is a builtin type while
SIMD is not. We need to think about this and have a library-based solution.
String supporting UTF-8
We want to enhance the
String type to support UTF-8 encoding before starting work on file system.
Mutable and explicit types when iterating over collections
This was noted as a known
sharp edge in the roadmap & sharp edges document. The behaviour here is definitely subject to change, maybe syntax like
for var i in range(3) but I don't have a strong opinion.
Local Toolchain Release
We are working on this, and expect to ship it in
O(few months)! Please sign up for our newsletter to track progress, thanks!