SciChart® the market leader in Fast WPF Charts, WPF 3D Charts, and iOS Chart & Android Chart Components


My group is evaluating SciChart for high-performance realtime charting. We are testing the line renderer with XyDataSeries.
We were able to get good performance, but we need to squeeze out more, and our scenario differs a bit from how XyDataSeries is normally used.

To make a long story short, we cannot append, because our application has 2 strict requirements:

  • every data refresh we need to ditch the whole dataset and replace it
    with a new (usually larger) one (no append)
  • we need to display each refresh immediately, even if it means delaying user input (so Immediate, or Manual with a refresh after
    every step)

which means loading a new set of points into the XyDataSeries at every step. This means doing either:

var dataSeries = new XyDataSeries<double, float>(samples.Length);
dataSeries.Append(domain, samples);
m_renderableSeries.DataSeries = dataSeries;

or:

dataSeries.Clear();
dataSeries.Append(m_domain, samples);

(btw, the first one is slightly faster, 190ms vs 240ms to draw 10 million points)

This obviously works against how XyDataSeries is implemented.
A faster way would be to just do:

var dataSeries = new ReadOnlyXyDataSeries(domain, samples);
m_renderableSeries.DataSeries = dataSeries;

where ReadOnlyXyDataSeries simply wraps the samples array and makes it available to the renderer without any copy.
So I implemented ReadOnlyXyDataSeries as an IXyDataSeries<double, float>.
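For context, the wrapper I have in mind is roughly the following sketch (the class and its member names are my own invention for illustration; the real IXyDataSeries<double, float> interface has many more members to implement):

```csharp
// Hypothetical zero-copy read-only series (illustrative sketch only; the full
// IXyDataSeries<double, float> surface is much larger).
public sealed class ReadOnlyXyDataSeries : IXyDataSeries<double, float>
{
    private readonly double[] m_xValues;
    private readonly float[] m_yValues;

    public ReadOnlyXyDataSeries(double[] xValues, float[] yValues)
    {
        // Wrap the caller's arrays directly: no allocation, no copy.
        m_xValues = xValues;
        m_yValues = yValues;
    }

    public int Count => m_xValues.Length;

    // The remaining interface members (ranges, locks, hit-test support, etc.)
    // are elided here.
}
```

The point is that the constructor stores references to the caller's arrays rather than appending element-by-element into internal buffers.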

To my surprise, however, this performs much worse (900ms to draw 10 million points), when it should perform better (it is essentially an XyDataSeries without the copy).

UPDATE: This is not true: I was setting IsSortedAscending = false on sorted data. Once I set it to true, performance is back to exactly that of XyDataSeries. Which is good, but not stellar.

(hack time) I know there is room for improvement, because I tried the following (hacky, brutal, very bad) thing:

var internalList = (ISciList<float>)dataSeries.YValues;
Array.Copy(samples, internalList.ItemsArray, samples.Length);

instead of the Clear/Append pair, and it is much faster (30ms shaved off). No copy at all should be faster still.

Obviously I am missing something. So, what am I missing? How should I implement a custom IXyDataSeries in a fast way? Is there another way?

Version
5.4

Good morning Lorenzo,

Again my sincere apologies for our tardy response on this issue. I’d like to know why you need to throw away the whole DataSeries, however I appreciate that you do and will answer your question as best as I can.

The reason the DataSeries.Clear() + DataSeries.Append() route avoids reallocation is that internally the memory is not recreated: the buffers are cleared, but kept.

In the second method, new DataSeries(), you are creating a 160MByte buffer (10M points * 2 * 8 bytes) and garbage collecting another one. Despite the operations looking very similar, under the hood there are optimisations when you clear a DataSeries to re-use pooled memory.

When analysing performance and asking "how can I improve it?", it is important to understand the physical limitations of the computer. The memory bandwidth of a modern PC is about 16GBytes/second. 10 million points of double/double data is 160MBytes, so it takes a minimum of 10ms to copy that data into memory, giving you a maximum frame rate of 100FPS. Don't forget we have to read back those 160MBytes to draw them, which is another 10ms, dropping the maximum frame rate to 50FPS, unless we cache internally (which we do, if the data has not changed). This is a physical wall and cannot be circumvented. There is no cheap or easy trick to make the computer outperform memory bandwidth.
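As a back-of-envelope check of the arithmetic above (the ~16 GB/s bandwidth figure is the answer's working assumption, not a measured value):

```csharp
// Back-of-envelope check of the frame-budget arithmetic above.
const long points = 10_000_000;
const long bytesPerPoint = 2 * sizeof(double);   // X + Y, 8 bytes each
const long totalBytes = points * bytesPerPoint;  // 160,000,000 bytes, ~160 MB

const double bandwidth = 16e9;                   // assumed ~16 GB/s memory bandwidth
double writeMs = totalBytes / bandwidth * 1000;  // ~10 ms to write the data
double readMs  = totalBytes / bandwidth * 1000;  // ~10 ms to read it back for drawing

double maxFps = 1000.0 / (writeMs + readMs);     // ~50 FPS upper bound
```

(In the question's actual case Y is float, so the buffer is 120 MB rather than 160 MB, but the conclusion is the same: memory traffic alone caps the frame rate well before any drawing work is counted.)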

So given this, how does DataSeries.Append() work under the hood? If the memory is already allocated (and it is, if you Clear(): the values are cleared but the memory is not destroyed; it is reused), then Append() simply fills those buffers again. However, it does a little more than that. Internally, the DataSeries calculates the distribution of your data: we check whether it is sorted ascending, whether it is evenly spaced, and whether it contains NaN, null or empty points. All of these flags are required for later optimisation of drawing: by knowing the properties of the data in advance, we are able to select the best-performing drawing algorithm for your situation. Rather than applying a hammer to everything, we choose the right tool.

Now it is possible to hack (override) the Data Distribution calculation and specify those flags yourself. If you know the distribution of the data, you can tell SciChart and avoid this calculation entirely, making DataSeries.Append much faster. There is an example of this here.

The DataSeries.DataDistributionCalculator is a class which determines the distribution of your data (sorted in X or not, evenly spaced in X or not), and the flags are used to determine the correct algorithm(s) for resampling, hit-testing and indexing of data.

By default this all works automatically. However, if you want to save a few CPU cycles and you know in advance the distribution of your data, you can override the flags as follows:

var xyDataSeries = new XyDataSeries<double, double>()
{
    // use this feature to override Data-distribution calculation and
    // provide your own flags to save CPU cycles. Only use if you are certain about the
    // distribution of your data
    DataDistributionCalculator = new UserDefinedDistributionCalculator<double>()
    {
        IsSortedAscending = true,
        IsEvenlySpaced = true,
    }
};

If you would like to discuss your requirements further and how to squeeze precious CPU cycles out of SciChart, I'd welcome you to get in touch with us via the http://www.scichart.com/contact-us page. We also have a project in progress (ETA early September) which is a total performance overhaul, and which may also help your project get the best from SciChart.

Best regards,
Andrew

  • Andrew
    Hi Lorenzo, got it. DataSeries is also array-backed, and you found the array via the X/YValues.ItemsArray property. You are free to manipulate this directly if you want (we do it too). Just be advised that (1) X/YValues.ItemsArray may be larger than DataSeries.Count, so always use DataSeries.Count when iterating in a loop; and (2) updates to X/YValues.ItemsArray won't trigger recalculation of the DataDistribution or a redraw of the chart, so you'll have to do this yourself. You can't at present create a DataSeries which wraps your own array; this has been requested as a feature. The reason is that there are so many other calculations & optimisations we do that wrapping an array could cause race conditions or other errors.
  • Andrew
    However … if you really need this, I can ask the dev team to look into it. We can carry out custom work for customers of SciChart so that they can get the best from our library. Do you want to talk to our sales about this? I can get someone to call you or email to discuss requirements. Best regards, Andrew
  • sheldon vestey
    Hi Lorenzo, you can get in touch via sales@scichart.com or personally to myself at sheldon@scichart.com to discuss custom work if you think your project would benefit from it. We also have a feature request system https://www.scichart.com/feedback-policy/ which customers can use to request features. As Andrew says, we are also working on a performance improvement which we can outline to you as well to see if that would benefit your project.
  • Lorenzo Dematte
    Thank you for your answers, very helpful. At this stage for us it is not "we really need this", but more like "can it be done?". We are knowingly pushing the boundaries to understand the limits and see if it can fit; right now it is not certain we will need all the optimizations and options we are discussing.
  • sheldon vestey
    We think that what you want can be done but it’s a non-standard use of our library and would require some developer time to investigate and do a feasibility evaluation. We have a strict no mis-selling policy at SciChart so we always undertake full due diligence and if the task cannot be done we do not expect a sale. However, with all non-standard work, we would need to discuss order size before we can expend developer time! Let me know your thoughts and feel free to send me an email to discuss it further.