Is merging history supposed to increase the file size?

[Attachment: IMG_1738]

This just came up on the Facebook group, so I thought I’d ask here.

Merging history is nearly doubling the file sizes. Is this supposed to happen? Does it have any adverse effects?

Thanks

Yes, it is supposed to happen. No, it doesn’t have any effect. When you export a design, it creates an archive of the data representation. The size of that archive has no relevance to anything.


Thanks for the quick reply. I was worried that it might affect performance on large files but if it doesn’t then we’re all good.

I disagree with @istvan’s assessment; I believe it does. Especially as file sizes grow with parts, and even with merged history. It now only takes a few added sketches and history steps, even one or two lines in a sketch, for it to begin to lag and hang and become unbearable again. Isolation mode is still much slower when direct modeling.

I’m frequently copying an object and modifying it to fit other parts; sometimes the changes don’t work out and I’ll delete it and start over with a new copy. I don’t particularly understand why, when an object is deleted, its history steps remain and just show the yellow warning triangle instead of being deleted as well.

Perhaps this is just a localized issue with large assemblies; if so, I hope that can be stated clearly, and that the team is looking into optimizing for them. Because for those of us who work on large and geometrically complex assemblies, neither the HBPM nor the collapse-and-direct-model paradigm is working well right now.


While larger files can be slower, not every large file is slow, and not every small file is fast. There is correlation but no causation. When you collapse the history, that’s basically the same operation as exporting and reimporting. The change in the size of the file is due to the changed representation of the data. A parametric model is a sequence of operations, which does not take much storage, while the collapsed model stores the entire geometry. It’s the difference between storing the text “cube” and storing every coordinate of every vertex, edge, face, and surface.
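To make that concrete, here is a toy comparison in Python. It is illustrative only: the operation records and the vertex list are made up for the example and are not Shapr3D’s actual data model.

```python
import json
import struct

# "Recipe" form: a couple of short operation records describe the cube.
history = [
    {"op": "sketch", "profile": "square", "size_mm": 10},
    {"op": "extrude", "of": "sketch", "distance_mm": 10},
]
recipe_bytes = len(json.dumps(history).encode())

# Explicit-geometry form: every vertex coordinate written out as doubles.
# A real b-rep would also need edges, faces and surface definitions on top.
vertices = [(x, y, z) for x in (0, 10) for y in (0, 10) for z in (0, 10)]
geometry_bytes = len(b"".join(struct.pack("<3d", *v) for v in vertices))

print(recipe_bytes, geometry_bytes)  # the explicit form is already the larger one
```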


I understand not all files are the same. My current project is lots of small, very geometrically complex parts, with lots of variants of each. It’s been ballooning the file size and degrading performance as badly as before. I just don’t understand how this performance degradation, which never occurred prior to HBPM, is still being denied as related to it.

Attached is a screenshot showing the size variations of the same file saved as x_t and as native shapr, pre- and post-collapse of the history. You say collapsing the history is the same as exporting and re-importing and has no relevance, but that’s not what I’m seeing. The history for this wasn’t even terribly long, less than 200 steps.

Even with the history collapsed, the performance degradation is so severe that a file with fewer than 1,500 parts performs just as badly as my previous file with 3x the parts plus its hidden auto-generated backup file.

When moving something, the first move is quick, but the second can take minutes to apply. I’m also getting super weird click errors with the pencil when trying to select faces: even with pauses in between, it registers as a double tap because the processing has slowed down. I’m getting all-new errors I’ve never seen before when trying to create geometry; trying to loft gives me an error saying it can’t be named. And sketching is so slow it renders the program useless.

We did introduce performance regressions with the parametric release. Almost all of the ones we are aware of have been fixed since then, and some more will be fixed in the coming weeks. However, the size of the archive has little to do with performance.

If you report a bug with a way to reproduce it, we always investigate, and we normally fix it within 2-4 weeks.

Again, Shapr3D needs to handle the history efficiently and seamlessly. Requiring merging places a further burden on the user to manage something the program creates as part of its own design process.

I have used other graphics-intensive software with layers and history. It was not up to the user to manage the history as part of the routine use of the software. The user had options for managing it, but the software created it and made it work seamlessly. It did not bog down performance.

If Shapr3D can manage the history efficiently and give the user access and some tools to manage it, it will become an awesome program once again. Others do it. What are they doing? Could the Shapr3D developers come up with efficient solutions of their own? I guess that is the crux of the matter, is it not?

My recommendation for creating a key field that identifies relationships between sketches and bodies may also serve as a means of automatically merging related sections of the history. (Please refer to my earlier reply where I describe this idea.) I think a lot of functionality could come from something like associating the sketches with the body.

I hope these are simply growing pains. I am taking the time to participate in this discussion because I hope Shapr3D gets past this pain, and still grows.

So I followed this gentleman’s steps from the Facebook discussion, and these are the results I ended up with (as did he, obviously).

Boring video of the process.

Surely this is a problem to end up with the exact same object with a merged history but 4 times the file size?

Yes. It’s the exact same file size that you’d have with the old version if you created a cube. Parametric files are smaller. Why does this matter though?

Perhaps what’s puzzling is that step one is just a single cube, and after doing 7 more steps you still end up with 1 cube, but the file size is now 4x larger for the same geometry.

Sadly we cannot test on the old version the theory being laid out here: that accumulating changes and history take up more memory and processing power, quickly ballooning file sizes and causing performance degradation.

I think the question being posed is why does this archive exist if the history gets merged? If it balloons file sizes and causes performance issues, it very much matters.


You said that file size has little to do with performance. Does that little bit not keep adding up the more times people merge history? I just followed the above steps 3 more times and the cube is now 10x the size.

What I’m really curious about is whether people need to be aware of this when they’re modeling. Both to keep expectations reasonable and to model efficiently. Is merging over and over again going to cause issues with larger files?

To be absolutely frank, performance issue reporting was ignored when this first began so I’m trying to be thorough when spotting things that might negatively affect users.

No, it does not matter. I’ll now try to go into the technical details of how a CAD system works to explain why it doesn’t matter, but it’s a bit beyond the scope of this forum.

As a start, I recommend reading this article: The technological foundations of CAD software

In a parametric CAD system, what you need to store under the hood is a sequence of operations that looks something like this:

sketch (list of lines, arcs, constraints, etc.)
extrude (references to the sketch)
fillet (references to the extrusion)
etc.

This takes up very little storage. The output of these operations is boundary representation (b-rep) data, which is the actual geometry.
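As a sketch of what that history might look like as data (hypothetical structures, not Shapr3D’s real schema), each step is just a tiny record that references earlier steps by id:

```python
from dataclasses import dataclass, field

@dataclass
class Step:
    id: int
    op: str                                      # "sketch", "extrude", "fillet", ...
    params: dict                                 # the numeric parameters of the operation
    inputs: list = field(default_factory=list)   # ids of the steps it references

history = [
    Step(1, "sketch",  {"lines": 4, "constraints": ["horizontal", "equal"]}),
    Step(2, "extrude", {"distance_mm": 10}, inputs=[1]),
    Step(3, "fillet",  {"radius_mm": 2},    inputs=[2]),
]
# A handful of records like these describe the whole model; the heavy geometry
# (the b-rep) only appears when the steps are executed by the modeling kernel.
```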

The b-rep data stores all the faces and the edges (topology) and the surfaces and the curves (geometry).

A single spline surface can take up more space than the entire parametric history of a complex geometry, because every point, every parameter, every piece of geometry has to be written into the file. That takes up a ton of space.
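Some rough, illustrative arithmetic (my own example numbers, not measured from Shapr3D) shows why a single surface can dwarf the history that produced it:

```python
# One bicubic NURBS patch with a 30 x 30 control net:
# each control point stores x, y, z and a weight as 64-bit floats,
# plus one knot vector per direction (n + degree + 1 = 34 knots each).
control_net = 30 * 30 * 4 * 8         # 28,800 bytes
knot_vectors = 2 * (30 + 3 + 1) * 8   # 544 bytes
print(control_net + knot_vectors)     # ~29 KB for a single surface

# Versus one history step written out as text:
step = "fillet(radius_mm=2, edges=[17, 18])"
print(len(step))                      # a few dozen bytes
```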

In a direct modeling tool, the only thing you store is the b-rep data, which takes up quite some space. In a parametric modeling tool, you don’t need to store the b-rep data, because it’s the result of the parametric history, which is stored; you can recalculate the b-rep data at any time. Currently we are not storing the b-rep data, only the parametric history. That’s why the load time can take longer than before: we re-run the entire history when you open the design.

When you merge the history, we delete the parametric history, and save the end result of the execution of the history, which is the b-rep data. This is the exact same data that we stored when Shapr did not have the hybrid parametric-direct engine. We just stored the b-rep data.
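In pseudocode terms (hypothetical names and toy stand-ins, not the actual Shapr3D internals), the two behaviours described above look roughly like this:

```python
def apply_step(step, brep):
    # Stand-in for the modeling kernel: executing a step updates the geometry.
    return brep + [f"result of {step}"]

def open_design(history):
    # On load, the b-rep is not stored, so the entire history is re-executed.
    brep = []
    for step in history:
        brep = apply_step(step, brep)
    return brep

def merge_history(design):
    # Merging keeps only the end result of the execution and drops the recipe.
    design["brep"] = open_design(design["history"])
    design["history"] = []
    return design

design = {"history": ["sketch", "extrude", "fillet"], "brep": None}
merge_history(design)
print(design)  # only the baked geometry remains; the steps are gone
```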

Now, what we store in the underlying database of Shapr3D is completely irrelevant to runtime performance. Eventually we’ll store the b-rep data too, because that way we can load a design faster: we won’t need to recalculate the parametric history when you open the design.

When you export a .shapr archive, that’s a zipped snapshot of the many different files and databases that make up a Shapr3D design. That said, the archive’s size doesn’t even track the size of the underlying data, which in turn is not relevant to anything performance-related. The .shapr archive is not the file format of Shapr3D; we don’t really have a “file format” in the traditional sense, because the underlying representation is stored in a database and a bunch of different files. That’s why you don’t need to save your designs manually, and that’s why it’s extremely hard to lose data: even if the application crashes, the underlying database stays intact.
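If you’re curious, an exported archive can be listed like any other zip file (assuming, per the above, that the .shapr export really is a zipped snapshot; the exact contents are an implementation detail and may change, and “MyDesign.shapr” is just a placeholder name):

```python
import zipfile

with zipfile.ZipFile("MyDesign.shapr") as archive:
    for info in archive.infolist():
        print(f"{info.file_size:>10}  {info.filename}")
```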

Because of this, if you merge the design, you’ll end up with an archive size similar to what you’d have had in the old version with the same design. “Pure” parametric archives will be smaller, because they don’t have the geometry baked into them. This will change at some point: we’ll save the geometry in the database to decrease load times, and then the archives will be bigger.


That’s an interesting read, thanks for that I learned a lot.

I get that this probably isn’t the time or place for a lesson but using your example from direct modelling:

1. A circular profile extruded into a solid cylinder, with two planar faces on the bottom and the top, 10 mm away from each other, and one cylindrical face connecting the bottom and the top.

When you merge the history several times, after many steps and deleting several objects, if you’re left with the original cylinder, why is the file bigger if it’s just saving the b-rep data again, as above?

I understand now that parametric data is smaller because it’s basically the recipe, whereas b-rep data is the baked cake (I think that works?), but I’m not understanding how going through loads of steps makes the file larger if all those steps are then deleted and you’re left with the exact same cylinder that you began with.

What is the additional data in the background that has increased the file size (despite it not affecting performance)? I’m presuming that it’s not b-rep data because there’s no additional geometry. Is it just the data that was deleted but stored away? Like if you put the recipe in the bin but the bin is still in the kitchen in case you need it for later?

That’s an implementation detail of the underlying database system (SQLite), which sometimes grows in size and sometimes shrinks. If you are interested in how an SQLite database works, it’s a fascinating, fully open-source project; you can learn more about the technical details on their website.
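A quick way to see that behaviour in isolation (this is generic SQLite behaviour, nothing to do with Shapr3D’s schema): deleting rows does not shrink the database file, the freed pages are just kept for reuse until a VACUUM rewrites the file.

```python
import os
import sqlite3

db = "demo.sqlite"
if os.path.exists(db):
    os.remove(db)

con = sqlite3.connect(db)
con.execute("CREATE TABLE blobs (data BLOB)")
con.executemany("INSERT INTO blobs VALUES (?)",
                [(os.urandom(100_000),) for _ in range(100)])
con.commit()
print("after inserts:", os.path.getsize(db))

con.execute("DELETE FROM blobs")
con.commit()
print("after delete: ", os.path.getsize(db))  # roughly unchanged: pages are marked free

con.execute("VACUUM")
print("after vacuum: ", os.path.getsize(db))  # now the file actually shrinks
con.close()
```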


I’ve used SQLite for years and understand everything that you’re talking about, but what a lot of us are experiencing doesn’t jibe with that.

After the first parametric update, one of my projects started taking 44 seconds just to open. Isolating and un-isolating was very sluggish. If I export that project in Shapr format, the file size is 136 MB. If I export it in x_t format, it is 1.9 MB.

When I import the x_t file back into a new project, it’s fast again. If I export it to Shapr format, the size is only 1.2 MB, and that one opens fast too. It wasn’t even a complex project, containing only about 35 bodies and maybe a dozen sketches.

Causation vs correlation and all that, but something definitely went south on that initial release.


This can all be true, and we did introduce performance regressions with the parametric release, particularly with big assemblies. Most of the issues are fixed already; those that are not will be fixed within the next few weeks. But none of that is related to the size of the archive.
