Re: [Esug-list] ESUG SummerTalk - Fuel, binary object serializer

On May 25, 2011, at 7:35 PM, Yoshiki Ohshima wrote:
At Wed, 25 May 2011 15:28:00 +0200, Mariano Martinez Peck wrote:
One of the most important uses we want to do with Fuel (in the future) is to be able to use it for Monticello (to replace mzc). The idea in addition is to be able to bootstrap a really small Pharo image (hetzel) and be able to load stuff without needing a compiler.
Sounds interesting! Can I learn about "hetzel" somewhere?
We are working on how to build minimal Pharo kernels (with the goal of a declarative bootstrap so you can build your own version of Smalltalk running in the normal Pharo image to experiment with new language features, as one example, or create a minimal image for deployment).
There is some information from Nicolas Paez who worked on this when visiting RMOD last year:
http://www.fast.org.ar/smalltalks2010/videos/Seed+project%3A+The+challenge+o...
http://rmod.lille.inria.fr/archives/talks/2010-Smalltalks-Paez-Seed.pdf
The status is that we learned a lot and continue to work on it... more to come later ;-)
Marcus
-- Marcus Denker -- http://www.marcusdenker.de INRIA Lille -- Nord Europe. Team RMoD.

On Thu, May 26, 2011 at 9:51 AM, Marcus Denker <marcus.denker@inria.fr> wrote:
On May 25, 2011, at 7:35 PM, Yoshiki Ohshima wrote:
At Wed, 25 May 2011 15:28:00 +0200, Mariano Martinez Peck wrote:
One of the most important uses we want to do with Fuel (in the future) is to be able to use it for Monticello (to replace mzc).
The idea in addition is to be able to bootstrap a really small Pharo image (hetzel) and be able to load stuff without needing a compiler.
Sounds interesting! Can I learn about "hetzel" somewhere?
We are working on how to build minimal Pharo kernels (with the goal of a declarative bootstrap so you can build your own version of Smalltalk running in the normal Pharo image to experiment with new language features, as one example, or create a minimal image for deployment).
There is some information from Nicolas Paez who worked on this when visiting RMOD last year:
http://www.fast.org.ar/smalltalks2010/videos/Seed+project%3A+The+challenge+o...
http://rmod.lille.inria.fr/archives/talks/2010-Smalltalks-Paez-Seed.pdf
The status is that we learned a lot and continue to work on it... more to come later ;-)
Ben is now working on the project. He will soon create a website inside RMOD with some more information. He also has some slides and a full report explaining it :)
Marcus
-- Marcus Denker -- http://www.marcusdenker.de INRIA Lille -- Nord Europe. Team RMoD.
_______________________________________________ Esug-list mailing list Esug-list@lists.esug.org http://lists.esug.org/mailman/listinfo/esug-list_lists.esug.org
-- Mariano http://marianopeck.wordpress.com

At Thu, 26 May 2011 09:51:00 +0200, Marcus Denker wrote:
On May 25, 2011, at 7:35 PM, Yoshiki Ohshima wrote:
At Wed, 25 May 2011 15:28:00 +0200, Mariano Martinez Peck wrote:
One of the most important uses we want to do with Fuel (in the future) is to be able to use it for Monticello (to replace mzc). The idea in addition is to be able to bootstrap a really small Pharo image (hetzel) and be able to load stuff without needing a compiler.
Sounds interesting! Can I learn about "hetzel" somewhere?
We are working on how to build minimal Pharo kernels (with the goal of a declarative bootstrap so you can build your own version of Smalltalk running in the normal Pharo image to experiment with new language features, as one example, or create a minimal image for deployment).
There is some information from Nicolas Paez who worked on this when visiting RMOD last year:
http://www.fast.org.ar/smalltalks2010/videos/Seed+project%3A+The+challenge+o...
http://rmod.lille.inria.fr/archives/talks/2010-Smalltalks-Paez-Seed.pdf
The status is that we learned a lot and continue to work on it... more to come later ;-)
Yay! Declarative bootstrap is good. MicroSqueak is mostly classes and code with a bit of setup, so I thought if the tracer (MicroSqueakImageBuilder) takes somewhat different format but essentially with the same information, you can build an image and then whatever code loader that can load Compiler, you can get to a working state. But as you say, there must be details and I'm all curious to learn also! -- Yoshiki

BTW, the page (http://rmod.lille.inria.fr/web/pier/software/Fuel) says:
Gofer new squeaksource: 'Fuel'; package: 'ConfigurationOfFuel'; load.
((Smalltalk at: #ConfigurationOfFuel) project =latestVersion) load: #(Core Tests Benchmarks).
it does not appear to be a valid expression. What is the right expression?
-- Yoshiki

On Thu, May 26, 2011 at 9:29 PM, Yoshiki Ohshima <yoshiki@vpri.org> wrote:
BTW, the page (http://rmod.lille.inria.fr/web/pier/software/Fuel) says:
Gofer new squeaksource: 'Fuel'; package: 'ConfigurationOfFuel'; load. ((Smalltalk at: #ConfigurationOfFuel) project =latestVersion) load: #(Core Tests Benchmarks).
it does not appear to be a valid expression. What is the right expression?
the =
Try:
Gofer new squeaksource: 'Fuel'; package: 'ConfigurationOfFuel'; load.
((Smalltalk at: #ConfigurationOfFuel) project latestVersion) load: #(Core Tests Benchmarks).
-- Yoshiki
-- Mariano http://marianopeck.wordpress.com

On Thu, May 26, 2011 at 4:41 PM, Mariano Martinez Peck < marianopeck@gmail.com> wrote:
On Thu, May 26, 2011 at 9:29 PM, Yoshiki Ohshima <yoshiki@vpri.org> wrote:
BTW, the page (http://rmod.lille.inria.fr/web/pier/software/Fuel) says:
Gofer new squeaksource: 'Fuel'; package: 'ConfigurationOfFuel'; load. ((Smalltalk at: #ConfigurationOfFuel) project =latestVersion) load: #(Core Tests Benchmarks).
it does not appear to be a valid expression. What is the right expression?
the =
Hi! Sorry, now it is fixed.
Try:
Gofer new squeaksource: 'Fuel'; package: 'ConfigurationOfFuel'; load. ((Smalltalk at: #ConfigurationOfFuel) project latestVersion) load: #(Core Tests Benchmarks).
-- Yoshiki
-- Mariano http://marianopeck.wordpress.com

At Thu, 26 May 2011 16:45:38 -0300, Martin Dias wrote:
On Thu, May 26, 2011 at 4:41 PM, Mariano Martinez Peck <marianopeck@gmail.com> wrote:
On Thu, May 26, 2011 at 9:29 PM, Yoshiki Ohshima <yoshiki@vpri.org> wrote:
BTW, the page (http://rmod.lille.inria.fr/web/pier/software/Fuel) says:
Gofer new squeaksource: 'Fuel'; package: 'ConfigurationOfFuel'; load. ((Smalltalk at: #ConfigurationOfFuel) project =latestVersion) load: #(Core Tests Benchmarks).
it does not appear to be a valid expression. What is the right expression?
the =
Hi! Sorry, now it is fixed.
Thanks! I tried it a bit and I'm officially impressed with its performance ^^; I had a simple serializer/materializer that only handles class definitions and compiled methods (and initialization of classes). But Fuel seems faster than that by a factor of two or so for reading methods. (Mine was aimed more at simplicity, so it has only a few methods and reads sized strings in a slow way, as well as using generous 4-byte padding. It does make a difference when it comes to performance.) -- Yoshiki

On Fri, May 27, 2011 at 7:37 PM, Yoshiki Ohshima <yoshiki@vpri.org> wrote:
At Thu, 26 May 2011 16:45:38 -0300, Martin Dias wrote:
On Thu, May 26, 2011 at 4:41 PM, Mariano Martinez Peck <marianopeck@gmail.com> wrote:
On Thu, May 26, 2011 at 9:29 PM, Yoshiki Ohshima <yoshiki@vpri.org> wrote:
BTW, the page (http://rmod.lille.inria.fr/web/pier/software/Fuel) says:
Gofer new squeaksource: 'Fuel'; package: 'ConfigurationOfFuel'; load. ((Smalltalk at: #ConfigurationOfFuel) project =latestVersion) load: #(Core Tests Benchmarks).
it does not appear to be a valid expression. What is the right expression?
the =
Hi! Sorry, now it is fixed.
Thanks! I tried it a bit and I'm officially impressed with its performance ^^;
I had a simple serializer/materializer that only handles class definitions and compiled methods (and initialization of classes). But Fuel seems faster than that for a factor of two or so for reading methods. (Mine was more on simplicity so it has only a few methods and does some slow way to read sized-strings, as well as generous 4 byte padding. It does make difference when it comes to performance.)
Sounds great! Do you have it in a public repository?
-- Yoshiki

At Fri, 27 May 2011 19:53:46 -0300, Martin Dias wrote:
Thanks! I tried it a bit and I'm officially impressed with its performance ^^;
I had a simple serializer/materializer that only handles class definitions and compiled methods (and initialization of classes). But Fuel seems faster than that for a factor of two or so for reading methods. (Mine was more on simplicity so it has only a few methods and does some slow way to read sized-strings, as well as generous 4 byte padding. It does make difference when it comes to performance.)
Sounds great! Do you have it in a public repository?
Sure. It is here: https://github.com/yoshikiohshima/SqueakBootstrapper
The system is just enough to load a compiler, so it does not handle curly braces, pragmas, or primitives. But by setting the path to the Squeak VM in the Makefile, it should run on major platforms where the VM takes command line arguments.
-- Yoshiki

Yoshiki
if you want to help testing, improving fuel you are welcome. The idea is to make it fast fast fast without vm support.
Stef
On May 28, 2011, at 12:37 AM, Yoshiki Ohshima wrote:
At Thu, 26 May 2011 16:45:38 -0300, Martin Dias wrote:
On Thu, May 26, 2011 at 4:41 PM, Mariano Martinez Peck <marianopeck@gmail.com> wrote:
On Thu, May 26, 2011 at 9:29 PM, Yoshiki Ohshima <yoshiki@vpri.org> wrote:
BTW, the page (http://rmod.lille.inria.fr/web/pier/software/Fuel) says:
Gofer new squeaksource: 'Fuel'; package: 'ConfigurationOfFuel'; load. ((Smalltalk at: #ConfigurationOfFuel) project =latestVersion) load: #(Core Tests Benchmarks).
it does not appear to be a valid expression. What is the right expression?
the =
Hi! Sorry, now it is fixed.
Thanks! I tried it a bit and I'm officially impressed with its performance ^^;
I had a simple serializer/materializer that only handles class definitions and compiled methods (and initialization of classes). But Fuel seems faster than that for a factor of two or so for reading methods. (Mine was more on simplicity so it has only a few methods and does some slow way to read sized-strings, as well as generous 4 byte padding. It does make difference when it comes to performance.)
-- Yoshiki

At Sat, 28 May 2011 09:33:27 +0200, Stéphane Ducasse wrote:
Yoshiki
if you want to help testing, improving fuel you are welcome. The idea is to make it fast fast fast without vm support.
Yeah. The reason for example I went to pad data to 4 bytes was that there may be a clever trick I may be able to do to read data into arrays "directly" and stuff, but did not get around to implementing it. Next time I can try things, I'll pay closer attention to Fuel... -- Yoshiki

On Sun, May 29, 2011 at 8:45 PM, Yoshiki Ohshima <yoshiki@vpri.org> wrote:
At Sat, 28 May 2011 09:33:27 +0200, Stéphane Ducasse wrote:
Yoshiki
if you want to help testing, improving fuel you are welcome. The idea is to make it fast fast fast without vm support.
Yeah. The reason for example I went to pad data to 4 bytes
What do you mean by padding data to 4 bytes? I don't understand :(
was that there may be a clever trick I may be able to do to read data into arrays "directly" and stuff,
what do you mean by that? like ImageSegment does? I mean, include also object headers in the stream and then avoid to recreate objects (using #basicNew) ?
but did not get around implementing it. Next time I can try things, I'll give more close attention to Fuel...
thanks :)
-- Yoshiki
-- Mariano http://marianopeck.wordpress.com

At Sun, 29 May 2011 20:52:45 +0200, Mariano Martinez Peck wrote:
On Sun, May 29, 2011 at 8:45 PM, Yoshiki Ohshima <yoshiki@vpri.org> wrote:
At Sat, 28 May 2011 09:33:27 +0200, Stéphane Ducasse wrote:
Yoshiki
if you want to help testing, improving fuel you are welcome. The idea is to make it fast fast fast without vm support.
Yeah. The reason for example I went to pad data to 4 bytes
What do you mean by padding data to 4 bytes? I don't understand :(
Say a string has length of 5 ('abcde'), the data in file for it would be "97 98 99 100 101 0 0 0".
was that there may be a clever trick I may be able to do to read data into arrays "directly" and stuff,
what do you mean by that? like ImageSegment does? I mean, include also object headers in the stream and then avoid to recreate objects (using #basicNew) ?
Not like ImageSegment, but more like #hackBits: reading from file into a string object of right size. For pointer arrays and actual literal fields, I am not sure I can apply this however. I would not put object headers, as you would agree^^; -- Yoshiki
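As an illustration of the 4-byte padding described above, here is a minimal Pharo sketch. It is not code from SqueakBootstrapper or Fuel; the variable names are made up for this note, and it only assumes the standard ByteArray and String protocol.

```smalltalk
"Hypothetical sketch: pad the bytes of a string with zeros up to a multiple of 4."
| string paddedSize padded |
string := 'abcde'.
"Binary operators evaluate left to right: (5 + 3) // 4 = 2, then 2 * 4 = 8."
paddedSize := (string size + 3) // 4 * 4.
"ByteArray new: answers an array filled with zeros."
padded := ByteArray new: paddedSize.
"Copy the 5 character codes into the first 5 slots; the last 3 stay zero."
padded replaceFrom: 1 to: string size with: string asByteArray startingAt: 1.
padded
```

Evaluated in a workspace, this should answer #[97 98 99 100 101 0 0 0], the same byte sequence as in the 'abcde' example above.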

On Sun, May 29, 2011 at 9:07 PM, Yoshiki Ohshima <yoshiki@vpri.org> wrote:
At Sun, 29 May 2011 20:52:45 +0200, Mariano Martinez Peck wrote:
On Sun, May 29, 2011 at 8:45 PM, Yoshiki Ohshima <yoshiki@vpri.org> wrote:
At Sat, 28 May 2011 09:33:27 +0200, Stéphane Ducasse wrote:
Yoshiki
if you want to help testing, improving fuel you are welcome. The idea is to make it fast fast fast without vm support.
Yeah. The reason for example I went to pad data to 4 bytes
What do you mean by padding data to 4 bytes? I don't understand :(
Say a string has length of 5 ('abcde'), the data in file for it would be "97 98 99 100 101 0 0 0".
Sorry but I don't understand then how that is related to "The reason for example I went to pad data to 4 bytes was that there may be a clever trick I may be able to do to read data into arrays "directly" and stuff, but did not get around implementing it."
was that there may be a clever trick I may be able to do to read data into arrays "directly" and stuff,
what do you mean by that? like ImageSegment does? I mean, include also object headers in the stream and then avoid to recreate objects (using #basicNew) ?
Not like ImageSegment, but more like #hackBits: reading from file into a string object of right size.
I am not sure if I understand. For some kind of objects (those that are variable) we store first its size and then at materialization time we directly create an object of that size using #basicNew: but I guess you are talking about something else.
For pointer arrays and actual literal fields, I am not sure I can apply this however.
I would not put object headers, as you would agree^^;
-- Yoshiki
-- Mariano http://marianopeck.wordpress.com

At Tue, 31 May 2011 23:16:12 +0200, Mariano Martinez Peck wrote:
On Sun, May 29, 2011 at 9:07 PM, Yoshiki Ohshima <yoshiki@vpri.org> wrote:
At Sun, 29 May 2011 20:52:45 +0200, Mariano Martinez Peck wrote:
On Sun, May 29, 2011 at 8:45 PM, Yoshiki Ohshima <yoshiki@vpri.org> wrote:
At Sat, 28 May 2011 09:33:27 +0200, Stéphane Ducasse wrote:
Yoshiki
if you want to help testing, improving fuel you are welcome. The idea is to make it fast fast fast without vm support.
Yeah. The reason for example I went to pad data to 4 bytes
What do you mean by padding data to 4 bytes? I don't understand :(
Say a string has length of 5 ('abcde'), the data in file for it would be "97 98 99 100 101 0 0 0".
Sorry but I don't understand then how that is related to "The reason for example I went to pad data to 4 bytes was that there may be a clever trick I may be able to do to read data into arrays "directly" and stuff, but did not get around implementing it."
was that there may be a clever trick I may be able to do to read data into arrays "directly" and stuff,
what do you mean by that? like ImageSegment does? I mean, include also object headers in the stream and then avoid to recreate objects (using #basicNew) ?
Not like ImageSegment, but more like #hackBits: reading from file into a string object of right size.
I am not sure if I understand. For some kind of objects (those that are variable) we store first its size and then at materialization time we directly create an object of that size using #basicNew: but I guess you are talking about something else.
Hmm, I'm also confused. You know that ByteStrings are padded in the image, right? And you see a user of #hackBits:, #nextWordsInto:, right? We can eliminate some buffer copying there. For example, when I run Fuel materializer, I see PositionableStream>>nextString is used which does not do basicNew: of (Byte)String but actually does extra copy (which can be easily eliminated and would make it faster). -- Yoshiki

On Sat, May 28, 2011 at 12:37 AM, Yoshiki Ohshima <yoshiki@vpri.org> wrote:
At Thu, 26 May 2011 16:45:38 -0300, Martin Dias wrote:
On Thu, May 26, 2011 at 4:41 PM, Mariano Martinez Peck <marianopeck@gmail.com> wrote:
On Thu, May 26, 2011 at 9:29 PM, Yoshiki Ohshima <yoshiki@vpri.org> wrote:
BTW, the page (http://rmod.lille.inria.fr/web/pier/software/Fuel) says:
Gofer new squeaksource: 'Fuel'; package: 'ConfigurationOfFuel'; load. ((Smalltalk at: #ConfigurationOfFuel) project =latestVersion) load: #(Core Tests Benchmarks).
it does not appear to be a valid expression. What is the right expression?
the =
Hi! Sorry, now it is fixed.
Thanks! I tried it a bit and I'm officially impressed with its performance ^^;
I had a simple serializer/materializer that only handles class definitions and compiled methods (and initialization of classes). But Fuel seems faster than that for a factor of two or so for reading methods.
Thanks Yoshiki. So you said Fuel was 2x faster at reading...did you check in writing (serializing) ? we would like to know the difference if possible. Cheers Mariano
(Mine was more on simplicity so it has only a few methods and does some slow way to read sized-strings, as well as generous 4 byte padding. It does make difference when it comes to performance.)
-- Yoshiki
-- Mariano http://marianopeck.wordpress.com

At Sat, 28 May 2011 19:26:03 +0200, Mariano Martinez Peck wrote:
Thanks Yoshiki. So you said Fuel was 2x faster at reading...did you check in writing (serializing) ? we would like to know the difference if possible.
Well, the serializing part is definitely not tuned for performance. It is even written in OMeta2! It does not support all the bells and whistles with new kinds of literals and stuff so I cannot really do a meaningful comparison with Fuel. But if I compare the serializing time and deserializing time of my thing, writing is about twice slower than reading. -- Yoshiki

On Sun, May 29, 2011 at 8:48 PM, Yoshiki Ohshima <yoshiki@vpri.org> wrote:
At Sat, 28 May 2011 19:26:03 +0200, Mariano Martinez Peck wrote:
Thanks Yoshiki. So you said Fuel was 2x faster at reading...did you check in writing (serializing) ? we would like to know the difference if possible.
Well, the serializing part is definitely not tuned for performance. It is even written in OMeta2! It does not support all the bells and whistles with new kinds of literals and stuff so I cannot really do a meaningful comparison with Fuel.
I understand.
But if I compare the serializing time and deserializing time of my thing, writing is about twice slower than reading.
Yes, I think that is the normal and more or less the expected difference right now. -- Mariano http://marianopeck.wordpress.com

At Sun, 29 May 2011 20:51:07 +0200, Mariano Martinez Peck wrote:
But if I compare the serializing time and deserializing time of my thing, writing is about twice slower than reading.
Yes, I think that is the normal and more or less the expected difference right now.
Yes, more or less. It, however, seems that if I don't use OMeta2, it can be quite a lot faster. -- Yoshiki

On 05/29/2011 08:51 PM, Mariano Martinez Peck wrote:
But if I compare the serializing time and deserializing time of my thing, writing is about twice slower than reading.
Yes, I think that is the normal and more or less the expected difference right now.
Do you want to make some comparisons with the GNU Smalltalk ObjectDumper? I can help, or I can even make some comparisons myself if you give me some test Fuel code and an image to test with. Paolo

On Mon, May 30, 2011 at 8:27 AM, Paolo Bonzini <bonzini@gnu.org> wrote:
On 05/29/2011 08:51 PM, Mariano Martinez Peck wrote:
But if I compare the serializing time and deserializing time of my thing, writing is about twice slower than reading.
Yes, I think that is the normal and more or less the expected difference right now.
Do you want to make some comparisons with the GNU Smalltalk ObjectDumper? I can help, or I can even make some comparisons myself if you give me some test Fuel code and an image to test with.
Hi Paolo. ObjectDumper was one of the serializers that we also wanted to analyze to get ideas and compare. This is why I once tried to install GNU Smalltalk on my machine, but I couldn't, so I gave up. The problem with comparing ObjectDumper is that we need to be able to run both in the same dialect/VM. Otherwise the difference of the dialect or VM can change the results a lot. Sorry for the offtopic, but is there an easy way to install GNU Smalltalk on Mac OS 10.6.7? Thanks -- Mariano http://marianopeck.wordpress.com

On 05/30/2011 10:02 AM, Mariano Martinez Peck wrote:
The problem to compare ObjectDumper is that we need to be able to run both in the same Dialect/VM. Otherwise the difference of the dialect or VM can change the results a lot.
That's true.
Sorry for the offtopic but is there an easy way to install GNU Smalltalk in a Mac OS 10.6.7 ?
With 3.2.4 and MacPorts it should be easy. You need to install the libsigsegv port first. It may not work right away if your compiler doesn't look in MacPorts paths. You can preempt the issue with an extra option to the configure script:
    ./configure --with-system-libsigsegv=/opt/local/lib --prefix=/opt/local
    make
    sudo make install
(or /opt/local/lib64, I don't know :)).
ObjectDumper should be easy to port to Pharo. Something like
    gst-convert -f gst -F squeak /opt/local/smalltalk/kernel/ObjDumper.st | tr '\n' '\r' > objdumper.sq
should be a good start. Check out category "private - binary I/O", that's where changes might be made. The resulting streams should be portable between dialects, though I've never tried. Testcases are old so they have not been converted to SUnit, but I'm sure someone in the GNU Smalltalk community can help. :>
Lastly, it was written entirely by me, so I can relicense parts of it if required. Contact me off-list if you are interested.
HTH,
Paolo

Yes, if you do not use the same VM, the problem is that you will compare apples and oranges.
Stef
On Mon, May 30, 2011 at 8:27 AM, Paolo Bonzini <bonzini@gnu.org> wrote:
On 05/29/2011 08:51 PM, Mariano Martinez Peck wrote:
But if I compare the serializing time and deserializing time of my thing, writing is about twice slower than reading.
Yes, I think that is the normal and more or less the expected difference right now.
Do you want to make some comparisons with the GNU Smalltalk ObjectDumper? I can help, or I can even make some comparisons myself if you give me some test Fuel code and an image to test with.
Hi Paolo. ObjectDumper was one of the serializers that we also wanted to analyze to get ideas and compare. This is why I once tried to install GNU Smalltalk on my machine, but I couldn't, so I gave up. The problem with comparing ObjectDumper is that we need to be able to run both in the same dialect/VM. Otherwise the difference of the dialect or VM can change the results a lot.
Sorry for the offtopic but is there an easy way to install GNU Smalltalk in a Mac OS 10.6.7 ?
Thanks
-- Mariano http://marianopeck.wordpress.com
participants (7)
- Marcus Denker
- Mariano Martinez Peck
- Martin Dias
- Paolo Bonzini
- stephane ducasse
- Stéphane Ducasse
- Yoshiki Ohshima