#2776 Is it possible to compute checksums of large files using pure Fantom?

LightDye Mon 11 Nov 2019

Hi everyone,

I'm trying to compute file checksums using pure Fantom. The simplest way I have found is this:

file.readAllBuf.toDigest( algorithm ).toHex

where file is just an instance of File and algorithm is for example "MD5" or "SHA-256", or any other supported algorithm.

It all works nicely until I try to run this against a large file of 280 MB, which causes:

sys::Err: java.lang.OutOfMemoryError: Java heap space
  fan.sys.MemBuf.grow (MemBuf.java:215)
  fan.sys.MemBuf.pipeFrom (MemBuf.java:148)
  fan.sys.SysInStream.readBuf (SysInStream.java:75)
  fan.sys.InStream.readAllBuf (InStream.java:170)
  fan.sys.File.readAllBuf (File.java:389)

That's no surprise, as I'm using the File.readAllBuf() method, which reads the whole file into memory.

The problem is that Buf.toDigest ...

Apply the specified message digest algorithm to this buffer's contents from 0 to size

Is it possible to process the file in small chunks to compute the checksum, the way the update methods of the Java class MessageDigest do?

Something like "How to Calculate File Checksum MD5, SHA in Java", but in pure Fantom?
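
For comparison, the chunked Java pattern I have in mind looks roughly like the sketch below. It only illustrates the MessageDigest.update approach and is not Fantom code; the class name ChunkedChecksum, the checksum helper and the 8 KB chunk size are assumptions made up for this example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class ChunkedChecksum {

    // Hypothetical helper: hashes a file chunk by chunk so that the whole
    // file never has to fit in memory at once.
    public static String checksum(Path file, String algorithm)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance(algorithm);
        byte[] chunk = new byte[8192]; // chunk size is an arbitrary choice

        try (InputStream in = Files.newInputStream(file)) {
            int n;
            // read() returns -1 at end of stream
            while ((n = in.read(chunk)) != -1) {
                // feed only the bytes actually read into the digest
                md.update(chunk, 0, n);
            }
        }

        // convert the raw digest bytes to a lowercase hex string
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }
}
```

Calling ChunkedChecksum.checksum(path, "SHA-256") would then return the hex digest while holding only one chunk in memory at a time, which is the behaviour I'm hoping to get in Fantom.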

go4 Wed 13 Nov 2019

Try it like this:

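buf := file.open  // file-backed Buf; the contents are not loaded into memory
digest := buf.toDigest(algorithm).toHex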

LightDye Wed 13 Nov 2019

Hi go4,

Well, this is embarrassing. I never thought it was going to be so easy, but it's Fantom, so I should have expected it. I just compared some checksums generated by a Java version with those from your Fantom version, and yours works perfectly. Thank you so much.

SlimerDude Wed 13 Nov 2019

"Well, this is embarrassing."

No, not at all. What is not apparent to the end-user / end-programmer is that many of the core classes, like Buf, often have several implementations.

So while the common Buf is an instance of MemBuf, or an in-memory Buf, there are also other more specialised instances - as demonstrated above.

Thank you Go4 for the answer, for that's something I wouldn't have thought / known about! (How embarrassing!)
