Showing data from file

Hello,

We would like to use SciChart to show data from a binary file (up to 1 GB!).
What would you suggest as the best method for achieving this?

Thanks

Zoe

More details:
The binary file contains for example the following data: a1b1c1a2b2c2a3b3c3…
where a1 b1 c1 etc are double byte numbers.
In the chart we would like to show 3 series:
series a: (1, a1) (2, a2) (3, a3) etc
series b: (1, b1) (2, b2) (3, b3) etc
series c: (1, c1) (2, c2) (3, c3) etc
Hope this is clear…
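
To make the layout concrete, here is a minimal sketch of splitting the interleaved stream into the three series. It is written in Python purely for illustration and is not SciChart code. It assumes "double byte numbers" means 16-bit unsigned little-endian integers (the "<H" format); the function name deinterleave and its parameters are hypothetical.

```python
import struct

def deinterleave(raw: bytes, n_series: int = 3, fmt: str = "<H"):
    """Split interleaved fixed-width samples into n_series lists of (x, y).

    fmt is an assumption: "<H" means 16-bit unsigned, little-endian.
    """
    size = struct.calcsize(fmt)          # bytes per sample
    record = size * n_series             # bytes per a/b/c record
    usable = len(raw) - len(raw) % record  # ignore a trailing partial record
    series = [[] for _ in range(n_series)]
    for i, (value,) in enumerate(struct.iter_unpack(fmt, raw[:usable])):
        x = i // n_series + 1            # X starts at 1, as in the question
        series[i % n_series].append((x, value))
    return series

# Example: a1 b1 c1 a2 b2 c2 a3 b3 c3
raw = struct.pack("<9H", 10, 20, 30, 11, 21, 31, 12, 22, 32)
a, b, c = deinterleave(raw)
```

With this sample input, a holds (1, 10), (2, 11), (3, 12), and b and c follow the same pattern.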

WPF

zoe asked 11 years ago


Answers (1)

Hi Zoe,

A question for you: how many rows are in your file? SciChart has been tested up to 10M points (by us) and around 20-30M points (by users). It performs well for line series with thickness=1. Scatter series, mountain series, etc. will not cope with millions of points the way line series do (that will change in the future, as we investigate alternative rendering technologies!)

Secondly, do you need to view all points at the same time? If so, the most efficient way to load a 1 GByte text file is to load it in blocks using Append(IEnumerable), as Yuriy pointed out. Consider a block size of, say, 4096 points, and keep loading until the file is exhausted. SciChart keeps a copy of your data in its DataSeriesSet, so it helps if you can stream the data to SciChart in blocks without keeping yet another copy of the entire file in memory.
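
As a rough illustration of block-wise loading, the sketch below reads a binary stream 4096 points at a time and hands each block to a callback. It is Python, not SciChart code. The `append` callback is a hypothetical stand-in for whatever call adds a block to the data series, and the 8-byte little-endian double format ("<d") is an assumption about the file.

```python
import io
import struct

def read_blocks(stream, block_points=4096, fmt="<d"):
    """Yield lists of at most block_points values from a binary stream.

    Assumes stream.read(n) returns n bytes unless the end is reached,
    which holds for regular files and io.BytesIO.
    """
    size = struct.calcsize(fmt)
    while True:
        chunk = stream.read(block_points * size)
        usable = len(chunk) - len(chunk) % size  # drop a partial last value
        if usable == 0:
            break
        yield [v for (v,) in struct.iter_unpack(fmt, chunk[:usable])]

def load_in_blocks(stream, append, block_points=4096, fmt="<d"):
    """Stream the file into `append(xs, ys)` one block at a time.

    Only one block is held here at any moment; returns the point count.
    """
    next_x = 1
    for ys in read_blocks(stream, block_points, fmt):
        xs = range(next_x, next_x + len(ys))  # X is just the running index
        append(xs, ys)
        next_x += len(ys)
    return next_x - 1

# Example with an in-memory "file" of 10 doubles and a block size of 4
blocks = []
data = io.BytesIO(struct.pack("<10d", *range(10)))
total = load_in_blocks(data, lambda xs, ys: blocks.append((list(xs), ys)), block_points=4)
```

The example produces three blocks of 4, 4 and 2 points. With a real file, pass a file object opened in binary mode instead of the BytesIO.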

If you don’t want to display all the points (say you want to virtualize a file on the screen), then we need to talk, as it’s a bit more complicated. We have a powerful API to extend SciChart (see the ChartModifierBase class) but we don’t have any examples for this exact use-case.

Best regards,
Andrew

Andrew Burnett-Thompson answered 11 years ago

- zoe
- 11 years ago
Thanks Yuriy and Andrew. Yes, we need to show all the data at first, then allow the user to zoom in on parts of it. I started implementing it as you both suggested, and indeed I don't keep another copy of the data; I just load it directly into your data series (in chunks). In theory we would like 1 billion points... But we know this is challenging, because of course we don't want all the points loaded in memory... What we would like best is to give SciChart the file, with some way of describing how the data is organised in it, and then let SciChart do the sampling on the file. We're ready to pay the price of having the user wait a while when zooming in, while SciChart loads more points from the file... Would you consider adding such a feature? Another option is to do the sampling on the file ourselves and load the sampled version into memory. But then we would have to hook into the zoom events and load the appropriate data each time the user zooms in or out... Thanks and have a nice day! Zoe
- Andrew Burnett-Thompson
- 11 years ago
Hi Zoe, I've been thinking about this and I think the only way to do it would be as follows:
- Read the entire file in chunks into a separate buffer called LowDetailBuffer. Literally read in 1000 points at a time, find the min, max in that region and append one point to the low detail buffer (or use your own simplification algorithm). This coarse resampling is done external to SciChart.
- Now fill SciChart with the low detail buffer
- Next, inherit a class from ChartModifierBase (or inherit from RubberBandXyZoomModifier). Override OnModifierMouseDown, OnModifierMouseUp, OnModifierMouseMove and intercept the zoom operation.
- Finally, with the zoom operation intercepted, decide if you want to continue showing data from the low-detail buffer (in which case just zoom to a new VisibleRange), or, reload the dataseries with new data from a high-detail buffer (stream from file).
It's not an easy problem to solve, but I think it can be done. Basically rather than let SciChart handle all this, I believe you will need to handle the resampling externally and just use SciChart as a front-buffer, capable of displaying up to (but no more than) 10M points. In the near future we'll be improving our rendering speed to handle more data, however, eventually you hit a memory wall! So, virtualization is definitely the way to go. Best regards, Andrew
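
The coarse resampling and the zoom decision described in the steps above can be sketched as follows. This is Python for illustration only and does not use any SciChart API. The names build_low_detail and choose_source are hypothetical, the bucket size of 1000 comes from the comment above, and the 10M threshold is taken from the stated display limit. Each bucket is stored as one entry carrying both its min and its max, which is one possible reading of "append one point".

```python
from itertools import islice

def build_low_detail(values, bucket=1000):
    """Reduce each run of `bucket` values to one (first_x, min, max) entry.

    `values` can be any iterable, e.g. a generator streaming from a file,
    so the full-resolution data never has to sit in memory at once.
    """
    it = iter(values)
    low = []
    x = 1                                # X index of the bucket's first value
    while True:
        block = list(islice(it, bucket))
        if not block:
            break
        low.append((x, min(block), max(block)))
        x += len(block)
    return low

def choose_source(visible_start, visible_end, max_points=10_000_000):
    """Decide which buffer to show for a zoomed X range (inclusive).

    Returns "high" when the raw points in the range fit under max_points
    (reload from file), otherwise "low" (keep the low-detail buffer).
    """
    count = visible_end - visible_start + 1
    return "high" if count <= max_points else "low"

low = build_low_detail([3, 1, 2, 9, 7, 8, 5], bucket=3)
# low == [(1, 1, 3), (4, 7, 9), (7, 5, 5)]
```

In a real application, choose_source would be called from the intercepted zoom handler, and a "high" result would trigger a re-read of just that X range from the file.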
- zoe
- 11 years ago
Hello Andrew, Your solution is definitely an acceptable one. I would like to try implementing this as part of our evaluation. I have a few questions: 1. Is there a difference between inheriting from ChartModifierBase and inheriting from RubberBandXyZoomModifier? When should I use which? 2. What exactly should I do in the overrides in order to load the new series? Sample code or even pseudocode would be greatly appreciated! I'm still at the beginning of the learning curve... Thanks a lot!!! Zoe
- zoe
- 11 years ago
Hello all, More thoughts about this issue... Let's say we have a PC with 8GB of RAM, and it's OK to use 1GB of RAM for our application. What will happen if we load a 1GB data series into SciChart? From what you wrote, I understand that rendering will get slow, is that right? So your suggestion is to do the sampling outside of SciChart, so that SciChart renders fast. But then when zooming or moving around the graph, rendering will be slowed down anyway, because we will have to refill the data series. Either way there will be a time penalty... Have you done any performance testing to see which solution has the smaller time penalty? You already have a sampling implementation in SciChart, and I suppose a lot of effort has gone into making it smooth and fast, so maybe it's better than doing our own custom sampling on the file... I will try to load a 1GB series into SciChart today and see what happens... But any input of yours would be greatly appreciated! Zoe
- Andrew Burnett-Thompson
- 11 years ago
Hi Zoe, Try it by all means. The most we've had in there is around 40,000,000 double/double points (so I guess that's 640MBytes when in memory), and it worked! Your 1GB file will not necessarily be 1GB in memory. How many rows are in the file? Is the file binary or text? Yes, our resampling algorithms are about as optimised as they can get for large data. The issue, however, is CPU cycles and memory bandwidth. Reading 1GB of data, no matter how fast your CPU, will have an associated latency because the memory bandwidth on modern PCs is at most 10GBytes/second, and that's when a block of memory is read sequentially. Re-filling will have its own penalty, but I'm thinking at most a couple of seconds, not many seconds or minutes of freezing. Do try it. In the meantime, we can try this on our side, but I need to know how many rows you have in your data-set (how many points) and what sort of data. Is the X-data sorted? Is it equally spaced? Is Y-data sinusoidal? Random? etc... Best regards, Andrew
- zoe
- 11 years ago
Hello Andrew, Here are some details about our data. The data is binary. It is collected from hardware collectors at regular intervals and written sequentially to the file. The file itself can be small, but we would like to support file sizes up to 1GB (for now, in the future maybe even more...). The x-axis is not in the file. It's just a series of numbers 1, 2, 3... that we add as we read the data from the file, so it's sorted and equally spaced. The y-axis should be treated as random data. It can be of various types, from single bits up to arrays of 256 bits (so it could also be bytes, int32, int64, etc.). The number of points/rows in our 1GB file depends on this type. So in the worst case, where all the data points in a 1GB file are single bits, we need 8 billion points on the chart... That might be just a little bit too much? But in many other cases we will need far fewer. Hope it's clear... Thanks a lot, Zoe
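
The point counts discussed in this thread follow from simple arithmetic, sketched below in Python. It assumes 1GB means 10^9 bytes (which matches the 8 billion figure above) and 16 bytes per point for a double/double series (which matches the 640MBytes estimate for 40,000,000 points). Both function names are hypothetical.

```python
GB = 10**9  # assumption: decimal gigabyte

def points_in_file(file_bytes, bits_per_sample):
    """Number of whole samples that fit in a file of the given size."""
    return file_bytes * 8 // bits_per_sample

def memory_as_double_pairs(points, bytes_per_point=16):
    """Bytes needed if every point is stored as an (X, Y) pair of doubles."""
    return points * bytes_per_point

for bits in (1, 8, 32, 64, 256):
    print(bits, points_in_file(GB, bits))
# 1 8000000000
# 8 1000000000
# 32 250000000
# 64 125000000
# 256 31250000

print(memory_as_double_pairs(40_000_000))  # 640000000
```

So only the widest sample types bring a 1GB file close to the point counts reported as tested in this thread; narrower types exceed them by one or two orders of magnitude.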