Crystal Reports Bursting vs Data driven jobs
Report bursting is a process introduced by Crystal Reports in the 90s, which lets you create a single report with grouped data and export a separate file for each group.
Data driven subscriptions are popular among SSRS developers. The user defines a query that returns some records. The software loops through the records and runs a report (or multiple reports) per record, using the record values to set report parameters, exported file names, email addresses, etc.
The end result (on the right) of both processes is the same: 3 PDFs and 3 emails. With bursting, however, you run one report for the whole batch, while with a data driven job you run 1 report per record (3 reports in the example above).
Which one is better for you? For me, the data driven job is a clear winner.
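The loop described above can be sketched in a few lines of Python. This is only an illustration, not how SSRS or any particular scheduler implements it: `run_report`, the `customers` table, the `CustomerId` parameter and the file naming are all hypothetical, and an in-memory SQLite database stands in for the real data source.

```python
import sqlite3


def run_report(report_name, parameters, output_file):
    """Hypothetical stand-in for a call into a real report engine."""
    return {"report": report_name, "parameters": parameters, "file": output_file}


def run_data_driven_job(connection, driving_query, report_name):
    """Run one report per record returned by the driving query."""
    connection.row_factory = sqlite3.Row  # access columns by name
    results = []
    for record in connection.execute(driving_query):
        # Record values drive the report parameter, the file name and the email.
        export = run_report(
            report_name,
            parameters={"CustomerId": record["id"]},
            output_file=f"orders_{record['name']}.pdf",
        )
        export["email"] = record["email"]
        results.append(export)
    return results


def build_demo_connection():
    """Create a tiny demo table with three customers."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER, name TEXT, email TEXT)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?, ?)",
        [
            (1, "acme", "acme@example.com"),
            (2, "globex", "globex@example.com"),
            (3, "initech", "initech@example.com"),
        ],
    )
    return conn


jobs = run_data_driven_job(
    build_demo_connection(),
    "SELECT id, name, email FROM customers ORDER BY id",
    "orders.rpt",
)
```

Three records in the driving query produce three exports, each with its own parameter value, file name and recipient.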
Here are a few things to consider:
Performance and Scalability:
The bursting process will be about 10% faster than the data driven process for small reports, but the result might be the opposite for big reports. This is because bursting parses the report's data locally, which works fine for small reports. However, if you have a report with 4000 groups of 2000 records each, the bursting job needs to download and parse 8 million records, and the local machine may not have enough memory to compete with a database server. In the same scenario, a data driven job downloads 2000 records at a time and the filtering is done on the server. A database server is much better at filtering data than the local Crystal Reports libraries.
This comparison is valid for a simplified scenario in which both the bursting job and the data driven job run as a single process. For bursting, this is always the case, since the execution is handled by the Crystal Reports engine on a single thread: you just run it and wait for it to finish. A data driven job, however, processes a set of records and can be optimized to use multithreading and handle multiple records simultaneously. Modern processors have multiple cores; bursting can use just one of them, while a data driven job can use the full power of all cores. If this is not enough, the recordset returned by the “driving” query can be split into parts, and each part can be sent to a separate machine for processing. Even on a single computer with just 2 cores, a data driven job will outperform bursting by at least 50%, no matter the size of the report.
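The difference in how much data reaches the local machine can be illustrated with a scaled-down sketch. It assumes a hypothetical `orders` table in an in-memory SQLite database and uses 4 groups of 5 records instead of 4000 groups of 2000; it models only the data access pattern, not the Crystal Reports engine itself.

```python
import sqlite3


def build_orders(groups, records_per_group):
    """Create a demo table with the given number of groups and rows per group."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (group_id INTEGER, amount INTEGER)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [(g, r) for g in range(groups) for r in range(records_per_group)],
    )
    return conn


def bursting_fetch(conn):
    """Bursting style: download every row, then split into groups locally."""
    rows = conn.execute("SELECT group_id, amount FROM orders").fetchall()
    grouped = {}
    for group_id, amount in rows:
        grouped.setdefault(group_id, []).append(amount)
    return len(rows), grouped


def data_driven_fetch(conn, group_id):
    """Data driven style: the server filters, only one group is downloaded."""
    query = "SELECT amount FROM orders WHERE group_id = ?"
    return conn.execute(query, (group_id,)).fetchall()


conn = build_orders(groups=4, records_per_group=5)
rows_in_memory, grouped = bursting_fetch(conn)
largest_batch = max(len(data_driven_fetch(conn, g)) for g in range(4))

# The same arithmetic at the scale used in the text.
article_scale_rows = 4000 * 2000
```

In the demo, the bursting-style fetch holds all 20 rows at once, while the data driven fetch never holds more than 5. At the scale from the text, that is 8,000,000 rows against 2000.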
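A rough sketch of both ideas, parallel processing on one machine and splitting the recordset into parts, might look like this in Python. `render_report` is a hypothetical placeholder for the call that exports one report. Threads are used here on the assumption that the actual rendering happens in an external engine; for CPU-bound work done in Python itself, a process pool would be the more appropriate choice.

```python
from concurrent.futures import ThreadPoolExecutor


def render_report(record):
    """Hypothetical placeholder for exporting one report for one record."""
    return f"report_{record['id']}.pdf"


def run_parallel(records, max_workers=4):
    """Process records concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(render_report, records))


def partition(records, parts):
    """Split the driving recordset into roughly equal parts, e.g. one per machine."""
    return [records[i::parts] for i in range(parts)]


records = [{"id": i} for i in range(1, 7)]
files = run_parallel(records)
chunks = partition(records, 2)
```

Each chunk returned by `partition` could be handed to a different machine, which then runs `run_parallel` on its own share of the records.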
Maintenance:
Let’s say you have a report showing orders for the USA. Then you receive a request to run this report for just 1 state, 3 cities, a county, a set of ZIP codes, a time zone, a phone prefix … the possible scenarios are unlimited, and sooner or later you will need to create copies of the report just to be able to filter your data. The report shows the same data for each order, but with bursting the filtering and grouping live inside the report, so each variation requires its own copy of the report. What happens if you need to change something? You will need to apply the same change to multiple reports, and the chance of ending up with reports that differ from each other is pretty high. In addition, you will need to test multiple reports. For data driven jobs, however, you only need one report. Since the “driving” query is outside the report, you can create many queries and variations of the job without changing the report, and there is only one report to change and test when changes are needed. As a result, the maintenance cost of data driven jobs is lower (sometimes much lower) than that of bursting jobs. Data driven jobs are a good example of why it pays off to follow the separation of concerns principle.
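To make the separation concrete, here is a small sketch in which the job variations are just different driving queries pointing at the same report. The `orders` table, the query names and the `orders.rpt` file name are hypothetical, and SQLite is used only to keep the example self-contained.

```python
import sqlite3

REPORT = "orders.rpt"  # hypothetical single report shared by every job

# Each job variation is only a different driving query; the report is untouched.
DRIVING_QUERIES = {
    "by_state": "SELECT DISTINCT state FROM orders ORDER BY state",
    "by_city": "SELECT DISTINCT city FROM orders ORDER BY city",
}


def plan_job(conn, job_name):
    """Return (report, parameter value) pairs for one job variation."""
    rows = conn.execute(DRIVING_QUERIES[job_name]).fetchall()
    return [(REPORT, row[0]) for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (state TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("TX", "Austin"), ("TX", "Dallas"), ("CA", "Fresno")],
)

state_runs = plan_job(conn, "by_state")
city_runs = plan_job(conn, "by_city")
```

Adding a new variation, such as one per ZIP code, means adding one more entry to `DRIVING_QUERIES`; the report file stays the same.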
Flexibility:
With data driven jobs, you run a query, loop through the records one by one and run one (or more) reports per record. This gives you various options. For example, based on the value of a field, you can choose to run a different report for each row. Maybe you have a list of products and you want to run one report for machinery and another for food, because they have different properties. This is possible with data driven jobs because you have the record's information before running the report, so you can choose which report to run. With bursting, you need to run the report first in order to get any data. There is no way to switch to another report after that, and you are restricted to a single layout. Even if you try to use subreports, there will be restrictions because Crystal Reports does not support nested subreports, and you will need to run multiple tests each time you add or change a subreport. In addition, a data driven job is not restricted to a single report. You can easily run 2 or more reports per record, or run a report, export it to PDF and append existing PDF files (for example a flyer, a welcome letter, etc.). With data driven jobs you have real flexibility. Bursting may be able to handle the basic scenarios, but it is more limited than the other options.
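Choosing reports per record can be sketched as a simple lookup on a field value. The category names, report file names and the fallback report below are all hypothetical; the point is only that the decision is made before any report runs.

```python
# Hypothetical mapping from a record's category to the reports to run for it.
REPORT_BY_CATEGORY = {
    "machinery": ["machinery_spec.rpt"],
    "food": ["food_label.rpt", "allergen_sheet.rpt"],  # two reports per record
}
DEFAULT_REPORTS = ["generic_product.rpt"]


def reports_for(record):
    """Pick the report(s) for a record based on its category field."""
    return REPORT_BY_CATEGORY.get(record["category"], DEFAULT_REPORTS)


def plan_exports(records):
    """Build the list of (product name, report) pairs the job would run."""
    plan = []
    for record in records:
        for report in reports_for(record):
            plan.append((record["name"], report))
    return plan


products = [
    {"name": "drill", "category": "machinery"},
    {"name": "cheese", "category": "food"},
    {"name": "poster", "category": "print"},
]
plan = plan_exports(products)
```

Here the machinery record gets one report, the food record gets two, and the unknown category falls back to the default report.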
Hardware requirements:
The bursting job retrieves and loads all the data at once, while a data driven job works with small chunks of data. As a result, the data driven job has lower hardware requirements.