[spine-cpp] 3.7 Beta Branch discussion of C++


Hey guys!

This morning a task came across my desk to update to the latest 3.6 runtimes. I was also asked to take a peek at the C++ sitting in the 3.7 Beta branch.

First, shout out to sgowen (Stephen) for taking on the task. We've been wanting a C++ branch for a while now... Cheers!

Also, mad props to badlogic, there are some seriously awesome features. This is quite possibly the least painful Spine upgrade I've ever done. Thank you!


So I started looking through the source to see what's what, and some weird things stood out. First was in this changeset where you got sick of writing new (__FILE__, __LINE__) everywhere.

The same changeset has a delete binary; added right before an exit(0);. Sure, that's in an example, but it's relevant.

Next was Vector. I saw that you were rolling your own and that it derived from SpineObject. String too. At first I thought perhaps you were trying to replicate boxing/unboxing like in C# or Java, but really SpineObject has default new/delete operators that forward the calls to the extensions (for memory tracking).
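Roughly the pattern I'm describing, as a simplified sketch (not the actual SpineObject code; the hook name is made up): a base class whose operator new/delete forward to a pluggable allocation hook, so every heap-allocated subclass gets tracked automatically.

#include <cstddef>
#include <cstdlib>

// Hypothetical allocation hook; the real extension records file/line for tracking.
struct AllocationHook {
   static void* allocate(std::size_t size, const char* file, int line) {
      (void)file; (void)line; // a real hook would record the call site
      return std::malloc(size);
   }
   static void deallocate(void* ptr) { std::free(ptr); }
};

// Base class for anything that may be heap allocated.
class TrackedObject {
public:
   void* operator new(std::size_t size, const char* file, int line) { return AllocationHook::allocate(size, file, line); }
   void* operator new(std::size_t size) { return AllocationHook::allocate(size, "unknown", 0); }
   void operator delete(void* ptr) { AllocationHook::deallocate(ptr); }
   void operator delete(void* ptr, const char*, int) { AllocationHook::deallocate(ptr); } // used if a constructor throws
   virtual ~TrackedObject() {}
};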

SpineString really got me though.. were you trying to avoid STL? Weird because it #includes the <string> STL header...

Then it occurs to me that you are taking Herculean efforts to avoid memory leaks and allow for 3rd party memory leak tracking. Cheers for that. We really appreciate the attention to detail.


Proposal: Don't. Don't use new, don't use delete. Don't rewrite standard STL containers.

Use C++11 instead.

C++11 has been around for 7 years (or more). For reference, Java 1.8 was released in 2014.

I'm well aware of cross-compiler support for various features. Every day I have to check my code against 4 platforms using 3 different compilers, one of which is MSVC 2010. I can attest that MSVC 2010 has all the features you need save one: std::make_unique. However, that can easily be replicated.
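For example, something along these lines covers most uses (just a sketch; the name is mine, and it assumes variadic templates, so on MSVC 2010 you'd need fixed-arity overloads instead):

#include <memory>
#include <utility>

// Poor man's std::make_unique (which is C++14) for C++11 compilers.
template <typename T, typename... Args>
std::unique_ptr<T> make_unique_compat(Args&&... args) {
   return std::unique_ptr<T>(new T(std::forward<Args>(args)...));
}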

If you conform to RAII and use shared_ptr, weak_ptr, unique_ptr, make_unique, make_shared, std::move, etc., then you won't need to track memory allocation at all. Just forbid new/delete outright in your code. Use Modern C++ that is easier to maintain.


Let's go back to the delete binary; example from earlier. The reason this line needed to be added is that binary wasn't allocated using RAII techniques. E.g...

SkeletonData* readSkeletonBinaryData (const char* filename, Atlas* atlas, float scale) {
   SkeletonBinary* binary = new (__FILE__, __LINE__) SkeletonBinary(atlas);
   binary->setScale(scale);
   SkeletonData *skeletonData = binary->readSkeletonDataFile(filename);
   if (!skeletonData) {
      printf("%s\n", binary->getError().buffer());
      delete binary;
      exit(0);
   }
   delete binary;
   return skeletonData;
}

Using RAII, it would be...

std::shared_ptr<SkeletonData> readSkeletonBinaryData (const char* filename, Atlas* atlas, float scale) {
   auto binary = std::make_unique<SkeletonBinary>(atlas);
   binary->setScale(scale);

   // readSkeletonDataFile should return unique_ptr or shared_ptr
   auto skeletonData = binary->readSkeletonDataFile(filename);
   if (!skeletonData) {
      printf("%s\n", binary->getError().buffer());
      return nullptr;
   }
   return skeletonData; // promoted to shared_ptr
}

In the second example nothing is leaked, even if an exception is thrown.


This post is getting long, so I'll save other points for further discussion.

In any event, are there any other improvements that you guys can think of? We're super excited to finally see a C++ runtime! Let's take a beat and make it easier to develop and maintain!

TL;DR: Forbid the use of new/delete in cpp version. Use C++11 instead. Then, drop memory leak tracking in spine-cpp.

Awesome! I was really looking forward to feedback like this. Let me go through it point by point.

First was in this changeset where you got sick of writing new (__FILE__, __LINE__) everywhere.

I actually did not get sick of writing that 🙂. But up until that point, I needed to make sure that all allocations (inside the runtime and within example code) could be properly tracked to rule out memory leaks. For end users, requiring them to specify the file and line number would be too tedious (but they still can).

If you have a look at the function right above readSkeletonBinaryData(), called readSkeletonJsonData, you'll see that RAII and stack allocation are not a problem. In fact, all classes in spine-cpp use RAII.

SkeletonData* readSkeletonJsonData (const String& filename, Atlas* atlas, float scale) {
   SkeletonJson json(atlas);
   json.setScale(scale);
   SkeletonData* skeletonData = json.readSkeletonDataFile(filename);
   if (!skeletonData) {
      printf("%s\n", json.getError().buffer());
      exit(0);
   }
   return skeletonData;
}

I've refactored the SFML example to use a style closer to C++11, including a poor man's std::make_unique replacement (which is C++14, not C++11). See spine-runtimes/main.cpp at 3.7-beta-cpp.

This refactored example code does not have any new or delete invocations (except for the make_unique emulation). On the user side, it is thus possible to use the spine-cpp API together with your preferred memory management strategy, be it smart pointers or more traditional means of shooting yourself in the foot. Stack allocation also works as intended, since all spine-cpp classes use RAII.

At first I thought perhaps you were trying to replicate boxing/unboxing like in C# or Java

Eh, I'm not that deep into JVM stuff. I'm especially not trying to replicate one of the worst features of it in a language with proper value types 🙂.

SpineString really got me though.. were you trying to avoid STL? Weird because it #includes the <string> STL header...

That should be #include <cstring> actually! Good catch. Read on for reasons to have our own string and vector implementations.

Then it occurs to me that you are taking Herculean efforts to avoid memory leaks and allow for 3rd party memory leak tracking. Cheers for that. We really appreciate the attention to detail.

TL;DR: Forbid the use of new/delete in cpp version. Use C++11 instead. Then, drop memory leak tracking in spine-cpp.

Taking a step back, I think these are the questions that we actually need to answer:

1. Does switching to smart pointers in the internal code and public APIs of spine-cpp increase maintainability and better abstract ownership semantics?

I do not believe that is the case. All of spine-cpp's code follows RAII, so ownership semantics are pretty clear, and memory management is quite simple as well. As illustrated with my SFML refactoring above, user code can decide to use smart pointers or avoid them. Were we to use smart pointers in the public API of spine-cpp, we'd force everyone to use them. A lot of our customers (and friends in the industry who I've passed some design decisions by) are averse to smart pointers, based on a history of projects extensively using them, only to find out that it's still too easy to write buggy memory management, and to observe some performance trade-offs due to the compiler not being sufficiently smart to inline/eliminate most of the magic.

So, given that the public API of spine-cpp allows end users to choose their poison, and given that ownership semantics in the internal code are straightforward, I would want to keep that aspect as it is.

2. Is there a benefit of allowing end users to plug in their own memory allocators and trackers?

Smart pointers are great most of the time, but do not prevent you from shooting yourself in the foot. Once you have shot yourself in the foot, being able to easily track down what exactly killed you is important.

The "herculian" task was straight forward affair of creating a handful of sensible new/delete pairs, making sure (potentially) heap allocated types derive from it, and annotating any internal allocations to track the file and line number where the allocation happened. This has helped tremendously in ensuring 1. we are as allocation free in inner loops as possible and 2. there are no leaks.

You can do leak detection with Valgrind and the sanitizers that come with Clang/MSVC++, and Instruments (and I think VS now, too) allows allocation hotspot detection at runtime. However, these tools are not available on each and every platform (screw you, Android).

The answer to this question in my opinion is thus: yes, providing pluggable allocation and tracking is indeed valuable, both to us as the maintainers and to end users who might target platforms we haven't even thought about (one such case being set-top boxes!). Having to annotate allocations inside spine-cpp is a very minor price to pay for that in my opinion.

3. Does switching from our custom SpineString and Vector to std::string and std::vector make custom memory allocation and tracking harder?

I believe the answer to this is a yes as well. Our custom allocation scheme is very simple to plug into and does not affect the existing code base, whereas allocators for collections and other STL classes have to be specified at the declaration site of a variable/member. It would also be impossible to track allocation sites as we do now, as we are not in control of the internal code of STL classes.
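To illustrate the declaration-site point (a simplified sketch, not code from spine-cpp; the allocator name is made up): with the STL, a tracking allocator becomes part of the container's type, so it spreads into every declaration and signature that touches the container, whereas our current scheme only touches the allocation call sites.

#include <cstdio>
#include <cstdlib>
#include <vector>

// Illustrative tracking allocator; the point is that it is baked into the type.
template <typename T>
struct TrackingAllocator {
   typedef T value_type;
   TrackingAllocator() {}
   template <typename U> TrackingAllocator(const TrackingAllocator<U>&) {}
   T* allocate(std::size_t n) {
      std::printf("allocating %zu elements\n", n); // a real tracker would record more
      return static_cast<T*>(std::malloc(n * sizeof(T)));
   }
   void deallocate(T* ptr, std::size_t) { std::free(ptr); }
};
template <typename T, typename U>
bool operator==(const TrackingAllocator<T>&, const TrackingAllocator<U>&) { return true; }
template <typename T, typename U>
bool operator!=(const TrackingAllocator<T>&, const TrackingAllocator<U>&) { return false; }

// Every declaration (and every signature that passes such a container around)
// has to spell the allocator out or be templated on it.
typedef std::vector<float, TrackingAllocator<float> > TrackedFloats;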

4. Does switching to std::string and std::vector have a performance impact?

In general, switching to std::vector would probably have no perceivable performance impact compared to our current default Vector implementation. However, end users can switch out the allocation strategy more easily as outlined in 3., and thus better combat fragmentation, do region management, or perform other cache-friendliness optimizations. This matters specifically in SkeletonJson and SkeletonBinary, where timelines and meshes are allocated and can benefit from a simple bump allocator.

A naive example: during development, you can use the default allocator and track the change in required memory around a loading call to SkeletonBinary or SkeletonJson. You can exactly figure out how big of a memory block is needed to store all the loaded data and switch to a bump allocator for production, increasing cache coherency and thus performance. The alternative solution for better cache coherence would be to completely rewrite the object graph model of the Spine API, something that we'll likely not do anytime soon, especially if such a simple solution exists that's generally applicable irrespective of the chosen object graph model.
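A rough sketch of that idea (illustrative only, not spine-cpp code): size a single block from the measurement taken during development, then hand out slices of it while loading and release everything at once.

#include <cstddef>
#include <cstdlib>

// Illustrative bump allocator: one pre-sized block, sized from a measurement of a
// SkeletonBinary/SkeletonJson loading call made during development.
class BumpAllocator {
public:
   explicit BumpAllocator(std::size_t capacity)
      : _block(static_cast<char*>(std::malloc(capacity))), _capacity(capacity), _offset(0) {}
   ~BumpAllocator() { std::free(_block); }

   void* allocate(std::size_t bytes) {
      std::size_t aligned = (bytes + 15) & ~static_cast<std::size_t>(15); // keep 16-byte alignment
      if (_offset + aligned > _capacity) return 0;                        // block exhausted
      void* result = _block + _offset;
      _offset += aligned;
      return result;
   }
   void deallocate(void*) { /* no-op: the whole block is released in the destructor */ }

private:
   char* _block;
   std::size_t _capacity;
   std::size_t _offset;
};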

In the case of std::string, the answer (based on our current code base) is yes. std::string instances have a tendency to multiply like rabbits when being passed down or constructed from plain old char*, const& notwithstanding. This is especially observable in loading code like SkeletonBinary and SkeletonJson. The custom String implementation allows us to have move semantics on the backing char*, and thus lets us avoid some unnecessary allocations at load time.
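For illustration (a stripped-down sketch, not the actual spine-cpp String): a constructor that can adopt an existing char* lets loading code hand over a buffer it already owns instead of paying for the copy that constructing a std::string would make.

#include <cstdlib>

// Sketch of a string that can adopt an existing heap buffer instead of copying it.
class OwnedString {
public:
   // own == true: take ownership of chars (no copy; the buffer is freed in the destructor).
   OwnedString(char* chars, bool own) : _chars(chars), _owns(own) {}
   ~OwnedString() { if (_owns) std::free(_chars); }
   const char* buffer() const { return _chars; }
private:
   char* _chars;
   bool _owns;
};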

I hope my justification for these design decisions doesn't come off badly. I REALLY appreciate that you took the time to write up your feedback, and I hope my reply shows that I sincerely considered your proposal. Based on my analysis above, I do however believe that we currently have a pretty good balance that should allow all our users to use spine-cpp in the way they are most comfortable with.

Hey! Thanks for the reply! It will take me a while to read through this and follow your points... back in a few!


1. Does switching to smart pointers in the internal code and public APIs of spine-cpp increase maintainability and better abstract ownership semantics?

badlogic:

I do not believe that is the case. All of spine-cpp's code follows RAII

My apologies, I wasn't trying to imply that spine-cpp didn't already use RAII, I guess the one example was really the only case. I should have looked harder for more examples instead of just looking at change sets.

badlogic:

A lot of our customers (and friends in the industry who I've passed some design decisions by) are averse to smart pointers

Ok, so I've been programming C++ for 20 years, 10 of them professionally. I remember the first smart pointers to come out... auto_ptr... and they were awful. There were quite a few errors in the contract, ones that couldn't be fixed until the C++ language was extended. Specifically, move semantics (rvalues), deleted methods (copy constructors), atomics, etc. came along to tie up the loose ends and fix the smart pointer problem.

Now, if you try to use a smart pointer incorrectly, you will get compiler errors telling you that you are violating the contract. The remaining runtime errors are objects being deleted too soon, something that's harder to do but easy to debug.
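A trivial example of what I mean:

#include <memory>
#include <utility>

void ownershipExample() {
   std::unique_ptr<int> a(new int(42));
   // std::unique_ptr<int> b = a;          // compile error: the copy constructor is deleted
   std::unique_ptr<int> b = std::move(a);  // OK: the ownership transfer is explicit
}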

Being an old dog myself, I understand the aversion to using new-fangled features, even if they have been around for 7 years and aren't going anywhere.

Ugh.. I feel like this is devolving into a discussion about smart pointers. Forget smart pointers.

How about switching to global new/delete overriding for only spine-cpp source code? Check out this code I wrote for the spine-c memory testing. It can be switched on and off for performance. The only requirement is that it is the last include in spine-cpp source files. No more annotations, and if you include it in every .cpp file then you automatically get tracking for all future code.
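Roughly what I have in mind (a simplified sketch, not the exact code I linked; SPINE_TRACK_MEMORY is just a placeholder name for the on/off switch): a header that is the last #include in every spine-cpp .cpp file and redefines plain new to record its call site, so no per-allocation annotations are needed.

// memory_tracking.h -- illustrative only; must be the last #include in a .cpp file.
#ifdef SPINE_TRACK_MEMORY

#include <cstddef>

// Defined once in a tracking translation unit; records file/line per allocation.
void* operator new(std::size_t size, const char* file, int line);
void  operator delete(void* ptr, const char* file, int line); // matching form, used if a constructor throws

// Every plain 'new' after this point records its call site.
#define new new(__FILE__, __LINE__)

#endif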

2. Is there a benefit of allowing end users to plug in their own memory allocators and trackers?

It's kinda funny, because I do use valgrind, Instruments, and the VS inspection tools as fallbacks for leak detection. Wouldn't it be simpler to say "Hey, you want to plug in your own tracker? Just use industry-standard detectors that have been around for a decade (or more)" and eliminate a LOT of code? Less code, fewer bugs.

Perhaps only do memory checks for internal consistency? Like during the CI phase?

3. Does switching from our custom SpineString and Vector to std::string and std::vector make custom memory allocation and tracking harder?

Yeah, don't do any tracking with std::string or std::vector. Like, none. So easier. Less code.

4. Does switching to std::string and std::vector have a performance impact?

Who said anything about performance? Sure, doing things like reserve or resize ahead of time will improve allocation performance, but I was never thinking about performance.

Don't re-invent the wheel. std::string and std::vector are known good implementations. Not only that, but they've been around long enough that many, many libraries use them and lots and lots of people automatically know what they are, how they work and (if needed) how to convert them into whatever they need.

How about a compromise? Internally, use std::vector (gist) and std::string (gist). Those versions have significantly less code and more features.

Summary
The less code you write, the less that can go wrong. The less code you have to debug the faster you can provide fixes. There's elegance in simplicity.

What I'm suggesting is that you lean on Standard Libraries to provide well tested, widely accepted implementations instead of rewriting everything yourself.

Perhaps to that end you don't need any leak detection in spine-cpp. Do you have leak detection in Java? C#? AS3? JS (TS)? Objective-C? Sure, these things have garbage collection, but isn't garbage collection orthogonal to the code? The same way Instruments, valgrind, etc. are orthogonal?

Ultimately what matters are the things that affect the runtime's users. Maintainability is fine to consider, but we are really not talking about a lot of code here; there are good reasons for it, and we'll be the ones doing the maintenance rather than the runtime's users.

I hope you don't take this the wrong way, but I feel like you largely ignored my reasoning.

We are not ignoring the well-tested code of the STL because we hate code re-use. We have very specific reasons why we do not use it, which I hope I outlined in my first reply. Please see them in my post above.

That some customers are not big fans of smart pointers is merely tangential, and not the reason we roll our own implementations.

Onwards to your suggestions:

How about switching to global new/delete overriding for only spine-cpp source code?

We should consider this. I will have to juggle a few APIs around.

Less code, fewer bugs. Perhaps only do memory checks for internal consistency? Like during the CI phase?

Yes, using the tools available on various platforms is generally the way to go. But as I said, some platforms have tools that are worse than our custom tracker, and some have no such tools at all (set-top boxes). Also, we are not talking about thousands of lines of code to enable this custom tracking. It's almost exactly 200 LOC, and all of that is essentially just a facade. I'm OK with maintaining that.

Yeah, don't do any tracking with std::string or std::vector. Like, none. So easier. Less code.

Again, there is a reason why we need tracking here, as I outlined in my previous post. Specifically, tracking at runtime, not for leaks, but for the question "how often and where do we allocate".

How about a compromise? Internally, use std::vector (gist) and std::string (gist).

That demonstrates that the current design is extremely easy to fork and switch over to the STL, no? Do you feel the changes to these two files are a burden for end users, should they prefer to use the STL? Conversely, how easy would it be for STL-averse end users to switch things around? Also, you did not comment on my reason for custom strings: custom ownership semantics of char* to avoid heap allocations at runtime. That is not something I can do with std::string.

The less code you write, the less that can go wrong.

Again, I 100% agree. But you yourself demonstrated that switching to STL is trivial. And the additional code for tracking is also only 200 LOC, which I'm happy to maintain for end users that want or need this mechanism. With your proposed change, end users that want to remove tracking should have a very simple time doing so.

Ok, thanks guys for hearing me out.

I appreciate your time.

jpoag, I really do not want to kill the discussion. As I said, your feedback is super valuable, and your contributions have also been of immense value, especially the test harness for spine-c. I'm not dismissing your arguments, I think you are correct about most of what we are discussing when we view it within the context of an end user that wants to have interoperability with STL, and wants to remove any superfluous code like the tracking for which they have alternatives on their target platforms.

But we also have to take into account use cases of other end users that have different requirements. I'm not sure how to better provide both camps with what they want. So this is an area of discussion that I'd like to focus on more.

I'm happy to make the proposed change of switching to a macro for internal allocations/deallocations. That will enable tracking to be completely removed (though that'd also mean a change in signature of the methods of the Extension class, which I'm not sure how to handle cleanly). So I guess this gets us a step closer to your ideal solution.

Now, how to deal with Spine's own Vector and String implementations? I'm reluctant to propose a #define way of switching between the two implementations, as that would indeed increase the maintenance burden. I'm also unsure how to deal with the requirement of having a string take ownership and manage the lifetime of a provided char*, which we need to ensure that no runtime allocations happen. I don't see how we can achieve this with std::string. Do you have another idea how we could achieve this?

Well, no. You've made good arguments. Especially with the "how often and where do we allocate" requirement. Honestly, I started thinking about other ways of doing it and at some point I realized that 1) it was getting complicated and 2) maybe I was just trying to think of something different just to be different. That's not really progress, that's just arguing to argue. I don't want to be that guy.

So yeah, context is important and it seems like you guys have good reasons for doing what you are doing.

And, just like having a reason for doing something, you can have a good reason for not doing something (YAGNI). My coding philosophy is: simple, elegant. If given the choice of adding preprocessor directives to switch code, I would lean against it and would rather keep SpineString and Vector as they are. Less code.

badlogic:

I'm also unsure how to deal with the requirement of having a string take ownership and manage the lifetime of a provided char*, which we need to ensure that no runtime allocations happen

This is part of what I was thinking about when I realized it would be more complicated. My first thought was something similar to how pugixml handles in-situ parsing: pre-allocate an entire buffer for all strings and use something similar to std::string_view. Then I remembered that one of the SpineString constructors I so callously deleted did exactly that. Exactly. That.

So please, don't complicate SpineString. Keep it as-is.


As for other suggestions? It's been a long day on my end.