Startup offers penalty-free file data reduction

Swiss startup balesio, staffed by all of nine people, has devised a penalty-free way of reducing unstructured data file sizes without altering the original file format, meaning no rehydration or decompression is needed to read the reduced-size files. Its Native File Optimisation (NFO) software technology analyses unstructured …

COMMENTS

This topic is closed for new posts.
  1. Tim Parker
    Thumb Down

    Press release

    I thought, briefly, that we'd stopped regurgitating storage company press releases with basically zero analysis.

    Seems I was wrong again.

    1. Chris Mellor 1

      Visually lossless

      In the demo I saw, balesio's software was run against a couple of images. The resulting output files were much smaller than the input files. Their on-screen dimensions were the same and their on-screen appearance, to my eyes, was the same as well.

      As far as I can see the optimisation technology is visually lossless (to human eyes). It does what it says on the tin.

      Also, just to enjoy a tart comment for a second, there was no press release, the story being based on an interview.

      Chris.

      1. Tim Parker

        Smaller

        "In the demo I saw there was a balesio run against a couple of images. The resulting output files were much smaller than the input files. "

        I can show this with compression and file optimizations for certain files, but not in general. There's a reason for that.

        "Also, just to enjoy a tart comment for a second, there was no press release, the story being based on an interview."

        Fair enough - I'll adjust my comment - "sounding like a press release". If you know anything about compression, file formats and/or information theory - which I'm sure you do - I'd hope you could see the lack of any useful analysis of this "technology", and why it doesn't (cannot) work in general as well as portrayed.
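        To make the counting argument concrete, here's a quick Python sketch (mine, nothing to do with balesio's product): no reversible compressor can shrink every input - if some inputs map to shorter outputs, others must map to longer ones - so random or already-compressed data tends to grow slightly, while redundant data shrinks enormously.

          import gzip, os

          random_blob = os.urandom(1_000_000)                 # high-entropy: effectively incompressible
          redundant_blob = b"the quick brown fox " * 50_000   # 1,000,000 bytes of pure repetition

          print(len(gzip.compress(random_blob)))     # slightly *larger* than 1,000,000
          print(len(gzip.compress(redundant_blob)))  # a tiny fraction of 1,000,000

        That's why "much smaller output files" for a couple of hand-picked images tells you nothing about the general case.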

  2. a53
    Jobs Halo

    Hmmm

    So will there be an Apple Mac version then?

  3. Anonymous Coward
    Anonymous Coward

    So what are these "unstructured" file formats anyway?

    I know Redmondian stuff is a bit of a jumble, but it's not quite freeform plain text. Apparently the implicit claim that the compressor needs no application-specific knowledge simply isn't true.

    1. Filippo Silver badge

      Re: So what...

      I don't know why they keep repeating the word "unstructured", though; the term is definitely misleading. But the compressor is obviously aware of the file format. I can see no claim to the contrary - they actually say they started with Microsoft Office files and moved on to PDFs.

  4. Kanhef

    Ever see HTML written in MS Word?

    It's hideous. In addition to all the Microsoft-only stuff, the same complex style tags are used over and over. Turning it into plain HTML reliably reduces file size by 80%. It sounds like they've found a way to automate that sort of process.

    'Structured' files are typically binary formats, where data is stored at fixed offsets within the file. Unlike XML, they can't be shortened without corrupting the file.
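    As a rough illustration of the Word-HTML point - hypothetical markup, not captured from real Word output - hoisting the style that gets repeated on every text run into a single class is exactly the kind of mechanical rewrite that yields those savings:

      import re

      # One verbose inline style repeated on every run, Word-style (illustrative).
      word_html = ('<span style="font-family:Calibri;font-size:11.0pt;'
                   'color:#000000">Hello</span> ') * 1000

      # Hoist the repeated style into a single rule, referenced by class.
      clean_html = ('<style>.t{font-family:Calibri;font-size:11pt}</style>'
                    + re.sub(r' style="[^"]*"', ' class="t"', word_html))

      print(len(word_html), len(clean_html))  # the rewrite is a fraction of the original size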

  5. JL 1
    Paris Hilton

    Visually unchanged? Ha Ha

    So the software does change the data - presumably irreparably - by downscaling images in documents. This is crazy talk for automated enterprise use. Imagine the support calls - "Hey, Helpdesk! Who the heck reduced my high-resolution image to a tiny JPEG?" Crazy talk, hence Paris.

  6. DrDedupe
    Stop

    In the interest of fair reporting

    Chris allow me to point out a few things here:

    1) "Scaling back" colour and resolution attributes may not be desirable, especially in regulatory and compliance instances.

    2) The 5%-10% savings attributed to NetApp dedupe is based on one customer's comment, which hardly represents an installed base of tens of thousands of dedupe users.

    3) Penalty-free? Dedupe's value in primary storage is providing reasonable capacity savings without degrading performance. Dissecting files, rescaling, and removing duplicate images seems like some mighty heavy lifting to me - why no mention of the performance penalty?

    Larry Freeman aka DrDedupe

    (a NetApp Employee)

    1. Chris Schmid

      True and not true

      Larry,

      true and not true.

      1) "Scaling back" is not what we are doing because it would mean we treat every object in the same exact way. No, what we are doing is recognizing the contents (if you wish "interpreting" correctly the elements and objects there) and optimize them according to what they are. the result is a visually lossless file. If we were to scale back attributes, we would not be visually lossless.

      2) True, that is a customer comment. But what is the true dedupe ratio for these kinds of unstructured files with internally compressed content (PowerPoint, images, etc.)?

      3) It is penalty-free because you do not need a reader or any rehydration of an optimized file. The optimization itself costs processing, but only once. Once optimized, the file is smaller and never needs to be rehydrated by any application or system. And a smaller file also loads faster, so after optimization less performance is required to handle that file.

      In general, our approach is totally different from dedupe. We don't look across files but INSIDE files to optimize capacity. By doing so, we create an open form of capacity savings, and users can still do primary dedupe and everything else in the same way after optimization.
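      To illustrate the container principle - a toy sketch in Python with the PIL imaging library, written for this thread, not our actual engine - an Office file is just a ZIP of XML parts plus media, so you can re-encode the objects inside it and write the container straight back out. The output is still an ordinary file that any application opens directly, with no rehydration step:

        import io, zipfile
        from PIL import Image

        def optimize_office_file(src, dst):
            # Re-encode each embedded PNG losslessly and rewrite the ZIP
            # container with maximum DEFLATE; the result is still a normal
            # .docx/.pptx that opens anywhere.
            with zipfile.ZipFile(src) as zin, \
                 zipfile.ZipFile(dst, "w", zipfile.ZIP_DEFLATED, compresslevel=9) as zout:
                for name in zin.namelist():
                    data = zin.read(name)
                    if name.lower().endswith(".png"):
                        buf = io.BytesIO()
                        Image.open(io.BytesIO(data)).save(buf, format="PNG", optimize=True)
                        if buf.tell() < len(data):   # keep only genuine wins
                            data = buf.getvalue()
                    zout.writestr(name, data)

      Our real optimization goes much further - it interprets each object and chooses an encoding per object type - but the "stays a native file" property is the same.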

      Best,

      Chris

  7. Chris Jakeman

    Showing Microsoft how to write software

    by reorganising the data and removing the bloat.

    Microsoft aren't interested in better products, only products with more features.
