The BrightScript Profiler is a tool for collecting and analyzing channel metrics, which can be used to determine where performance improvements and efficiencies can be made in the channel.
The BrightScript profiler provides the following metrics for a channel:
- CPU Usage: Determine where BrightScript code execution is happening
- Wall-Clock time: Determine where the most time (execution and waiting) is being spent in a channel
- Function call counts
- Memory usage, including memory leak detection – Available starting firmware version 9
Each of the above metrics can be used to diagnose problems and provide guidance to the channel developer to improve channel performance.
The workflow of the BrightScript Profiler is as follows:
- Add at least the required manifest entries to the channel to run the profiler
- Run and then Exit the channel to generate data and metrics
- Retrieve the profiling data from the device locally or over a network
- Analyze the profiling data as necessary
Below is the list of manifest keys used by the profiler:
If this entry value is
If this entry value is
|boolean||0||Yes||Turns on BrightScript Profiling when the channel is run. This is a master flag and must be set to 1 for any other profiling options to take effect.|
|boolean||0||If using memory profiling|
Turns on memory profiling. Only has effect if bsprof_enable=1.
If this is enabled,
Immediately upon starting the channel, profiling is paused until manually resumed with the bsprof-resume command.
This is useful for profiling isolated parts of a channel's UI or operations, rather than profiling the entire startup sequence of the channel.
Sets how often profiling samples are taken while the channel is running. Only has effect if bsprof_enable=1.
A sample ratio of 1.0 causes every BrightScript statement to be measured and integrated into the profiling data for the channel. Unfortunately, a sample ratio of 1.0 may cause some complex channels to run very slowly when profiling is enabled, making them difficult to test. Choosing a lower sample ratio for such channels can make them more usable while profiling is enabled. A higher ratio will yield more accurate data, so the ratio should be set at 1.0 whenever possible.
bsprof_sample_ratio can be adjusted from
1.0. A sample ratio of
1.0 is the default and will measure every BrightScript statement. A sample ratio of
1.0 will have some performance impact, but in most cases, it won’t affect the usability of your channel and will provide the most accurate data. However, if your channel is overly sluggish with a ratio of
1.0, you can reduce the ratio to reduce the profiler’s overhead. A lower sample ratio will provide less accurate data, so it’s recommended that you use the highest ratio that still allows your channel to be usable.
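As an illustration of how these entries might be added to a channel, the following Python sketch updates the key=value lines of a manifest. The helper name `set_manifest_entries` and the sample manifest contents are made up for this example; only the `bsprof_` keys come from the text above.

```python
def set_manifest_entries(manifest_text, entries):
    """Return manifest text with every key in `entries` set to its value.

    Existing key=value lines are rewritten in place; missing keys are appended.
    """
    remaining = dict(entries)
    out = []
    for line in manifest_text.splitlines():
        key = line.split("=", 1)[0].strip()
        if "=" in line and key in remaining:
            out.append(f"{key}={remaining.pop(key)}")
        else:
            out.append(line)
    # Append any profiler keys the manifest did not already contain.
    out.extend(f"{key}={value}" for key, value in remaining.items())
    return "\n".join(out) + "\n"


# Profiler keys taken from this document; the values are one possible setup.
PROFILER_ENTRIES = {
    "bsprof_enable": 1,            # master flag, required for all other options
    "bsprof_pause_on_start": 1,    # start paused, resume manually later
    "bsprof_data_dest": "network", # stream the data instead of storing it locally
}

# Hypothetical manifest used only for demonstration.
sample_manifest = "title=My Channel\nmajor_version=1\nbsprof_enable=0\n"
print(set_manifest_entries(sample_manifest, PROFILER_ENTRIES))
```

Remember to set the entries back (or remove them) before publishing, since the profiler adds overhead while the channel runs.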
Running the profiler on a channel
To initiate the memory profiler, sideload, run, and then exit the channel. The profiling data is complete only after the channel exits. Note that memory data can be streamed over a network. The advantage of streaming the data over the network is that it consumes significantly less memory on the device while the channel is running.
Pausing and resuming profiling
Channel profiling can be paused and resumed at any time. Use the following two commands on port 8080 to either pause or resume the memory profiler:
If the profiler is paused, very little data is written regardless of the data destination. This allows profiling data (generally, the data relevant to specific parts of a channel's UI or other operation) to be collected and later analyzed. These two commands are particularly useful when combined with the bsprof_pause_on_start manifest entry.
For example, if starting video playback is slow or seems to cause memory leaks, the bsprof_pause_on_start=1 entry can be set in the channel's manifest. After the channel is launched, but prior to video playback, execute the bsprof-resume command on port 8080 to begin collecting profiling data. After performing the UI operations to be profiled, execute the bsprof-pause command to suspend the storing of profiling data. Then, exit the channel to make the profiling data available for analysis. In this scenario, the profiling data will pertain specifically to the operations performed between the bsprof-resume and bsprof-pause commands.
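The pause and resume commands can also be sent from a script. The sketch below assumes the port 8080 console accepts newline-terminated text commands over a plain TCP connection (as a telnet session would send them); the function name `send_profiler_command` and the reply handling are illustrative, not part of any Roku API.

```python
import socket

# The two profiler commands named in this document.
PROFILER_COMMANDS = ("bsprof-pause", "bsprof-resume")


def send_profiler_command(host, command, port=8080, timeout=5.0):
    """Send one profiler command to the device console and return any reply text.

    Assumption: the console reads newline-terminated commands from a raw TCP
    socket. The reply format is not specified here, so it is returned as-is.
    """
    if command not in PROFILER_COMMANDS:
        raise ValueError(f"unsupported command: {command!r}")
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall((command + "\n").encode("ascii"))
        conn.settimeout(1.0)  # wait briefly for whatever the console prints back
        try:
            reply = conn.recv(4096)
        except socket.timeout:
            reply = b""
    return reply.decode("utf-8", errors="replace")
```

With `bsprof_pause_on_start=1` in the manifest, a test script could call `send_profiler_command(device_ip, "bsprof-resume")` just before the operation of interest and `send_profiler_command(device_ip, "bsprof-pause")` right after it, where `device_ip` is the address of your own device.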
Port 8080 Commands
These profiling commands exist on port 8080 (Roku OS Versions 9 and later):
|Get the status of BrightScript profiling|
|Pause the generation of profiling data|
|Resume the generation of profiling data|
Collecting the data
The channel's manifest entry bsprof_data_dest determines how the profiling data is retrieved from the device. The data can be stored locally on the device and downloaded after the channel has exited, or it can be streamed over a network connection while the channel is running.
Data Destination: Local
Local data storage is the default mode of operation, though it can be explicitly selected by adding bsprof_data_dest=local to the channel's manifest. When operating in this mode, the data becomes available in the device's Application Installer after the channel has exited:
- Launch the channel as you normally would and run through your test cases. Once you exit the channel, open the page to your Roku device's Developer Settings and click on Utilities.
After you have run through your channel's test cases, click Profiling Data to generate a .bsprof file and a link to download the data from your Roku device. The .bsprof format is unique to Roku to ensure the format is as efficient and small as possible and easy to generate even on low-end Roku devices.
Data Destination: Network
Available starting firmware version 9
To stream a channel's profiling data to the network while the channel is running, add bsprof_data_dest=network to the channel's manifest. Streaming data over the network is especially useful when profiling a channel's memory usage because all memory operations are included in the profiling data, and the amount of space necessary to store the data can be very large. By streaming the data to the network, the data size is limited primarily by the host computer receiving the data, and not by the available memory on the device itself. Profiling still places additional demands on the device's resources compared to running a channel without profiling, even when the data is streamed, but the use of resources on the device is significantly reduced.
When this feature is enabled, the start of the channel is delayed until a network connection is received by the device, which will be the destination for the data. When the channel is launched, a message similar to this will appear on the port 8085 developer console:
08-31 23:15:29.542 [scrpt.prof.connect.wait] Waiting for connection to http://192.168.1.1:8090/bsprof/channel.bsprof
The URL should be used with wget, curl, or a web browser. Once a connection is received from one of those programs, this message will appear on the developer console:
08-31 23:15:38.939 [scrpt.prof.connect.ok] profiler connected
When the channel is exited, this message will appear on the developer console:
08-31 23:16:04.774 [scrpt.prof.save.ok] Profiling data complete, sent via network
Once that message is seen, the profiling connection is closed by the device and the remote file is complete.
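As an alternative to wget or curl, the connection can be made from a short script. The following is a minimal sketch: it pulls the URL out of the "Waiting for connection" console message and streams the response to a local file. The function names and the regular expression are illustrative assumptions based on the message format shown above.

```python
import re
import shutil
import urllib.request


def extract_profile_url(console_line):
    """Return the URL from a 'Waiting for connection to <url>' console line, or None."""
    match = re.search(r"Waiting for connection to (\S+)", console_line)
    return match.group(1) if match else None


def download_profile(url, dest_path):
    """Stream the profiling data at `url` into `dest_path` and return the path.

    No timeout is set on purpose: the device keeps the connection open until
    the channel exits, and little data arrives while profiling is paused.
    """
    with urllib.request.urlopen(url) as response, open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest_path


# Example using the console message quoted in this document.
line = ("08-31 23:15:29.542 [scrpt.prof.connect.wait] Waiting for connection "
        "to http://192.168.1.1:8090/bsprof/channel.bsprof")
print(extract_profile_url(line))
```

Calling `download_profile(extract_profile_url(line), "channel.bsprof")` would then block until the channel exits and the device closes the connection.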
Processing the data
After downloading the .bsprof file, the data can be viewed using the BrightScript Profiler Visualization Tool.
BrightScript Profiler Visualization Tool - CPU output view
BrightScript Profiler Visualization Tool - Memory output view
Understanding the data
The profiling data is divided into 5 main sections:
- The function (and associated call path which can be expanded),
- CPU time,
- wall-clock time,
- function call counts,
- memory profiler output
The CPU time and wall-clock time sections are further divided into separate sections for self, callees, and total:
- self refers to the CPU/wall-clock time the function consumes itself.
- callees is the amount of time consumed by any functions called by the original function.
- total is the amount of time consumed by the original function (self) and any functions it calls (callees).
Function call paths
This section of the profiling data contains the function calls in each thread. For SceneGraph applications, each thread corresponds to either the main BrightScript thread or a single instance of a
For example, if you have a Task node that is instantiated multiple times, each instance will appear as a separate thread. The results will be the same for any custom
<component> in your channel that is instantiated multiple times. The main BrightScript thread (
Thread main) is also represented as a single thread even though it has no
The first 3 columns of the visualization tool list:
- the CPU time consumed by the function itself (self),
- the CPU time consumed by any other functions that it calls (callees),
- and the total amount of CPU time consumed (total).
The next 3 columns list:
- the amount of "wall-clock" time for the function,
- its callees,
- and the total.
Function call counts
As the name implies, this column lists the number of times each function was called while the channel ran with profiling enabled.
Values from memory profiling
Number of times this function was called
CPU* used in this function, itself
CPU* used in functions called by this function
cpu.self + cpu.callees
Memory allocated within this function itself, but not freed (leaks)
Memory allocated by functions called by this function, but not freed (leaks)
mem.self + mem.callees
Real (wall-clock) time** spent in this function, itself
Real (wall-clock) time** spent in functions called by this function
tm.self + tm.callees
Average of the metric, over the number of calls (e.g., if cpu.self=100 and calls=2, avg_cpu_self will be 50).
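The relationships between these values can be sketched in a few lines of Python. The dictionary layout and the `.total` key names are assumptions made for this illustration; the arithmetic (total = self + callees, average = metric / calls) follows the definitions above.

```python
def derive_metrics(row):
    """Add total and per-call average values to a row of self/callees metrics.

    `row` must contain "calls" plus "<prefix>.self" and "<prefix>.callees"
    for each of the cpu, mem, and tm prefixes.
    """
    derived = dict(row)
    calls = row["calls"]
    for prefix in ("cpu", "mem", "tm"):
        self_value = row[f"{prefix}.self"]
        callees_value = row[f"{prefix}.callees"]
        # total = self + callees (the ".total" key name is hypothetical)
        derived[f"{prefix}.total"] = self_value + callees_value
        # Average of the self metric over the number of calls.
        derived[f"avg_{prefix}_self"] = self_value / calls if calls else 0.0
    return derived


# Made-up numbers; cpu.self=100 with calls=2 mirrors the example in the text.
example = {
    "calls": 2,
    "cpu.self": 100, "cpu.callees": 40,
    "mem.self": 0, "mem.callees": 512,
    "tm.self": 300, "tm.callees": 60,
}
result = derive_metrics(example)
print(result["cpu.total"], result["avg_cpu_self"])
```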
A “memory leak” is simply any memory that is allocated, but not freed while the profiler was running. If memory is freed while profiling is paused, the free will not be tracked and the memory may show up as “leaked.”
Time is measured as if a stopwatch were used to time the action. For example, if a function makes a network call, there may be very little CPU time used, but a significant amount of time waiting for the network response.
If any of these metrics appear in a call path, they are specific to that call path. For example, in this call path:
The metrics for func2() are specific to when it is called from func1().
However, in this table:
The metrics displayed are totals for all calls to each function, on any call path.
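To make the distinction concrete, the sketch below sums per-call-path metrics into per-function totals. The data layout and the numbers are invented for illustration; only the function names func1 and func2 come from the example above.

```python
from collections import defaultdict


def totals_by_function(path_rows):
    """Sum per-call-path metrics into totals for each function.

    Each entry in `path_rows` is (call_path, metrics), where the last element
    of call_path is the function the metrics belong to.
    """
    totals = defaultdict(lambda: {"calls": 0, "cpu.self": 0})
    for call_path, metrics in path_rows:
        function = call_path[-1]
        totals[function]["calls"] += metrics["calls"]
        totals[function]["cpu.self"] += metrics["cpu.self"]
    return dict(totals)


# Hypothetical call-path data: func2() is reached both via func1() and directly.
rows = [
    (("main", "func1"), {"calls": 1, "cpu.self": 10}),
    (("main", "func1", "func2"), {"calls": 3, "cpu.self": 30}),
    (("main", "func2"), {"calls": 2, "cpu.self": 5}),
]
print(totals_by_function(rows)["func2"])
```

In this made-up data, the call-path view reports 3 calls to func2() from func1(), while the per-function view reports all 5 calls to func2() regardless of caller.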
Using this data
Here are a few key points on how to use this data to improve channel performance:
|Data Type||Definition and Best Use|
|High wall-clock time but low CPU time||This pattern shows a function is consistently waiting, whether for input or for a response from an external source. These functions are best suited for Task nodes so that they don't block the main thread.|
|Complex functions||Try to simplify functions as much as possible. If a function handles multiple tasks, consider breaking it out into several functions to further isolate how much CPU or wall-clock time is consumed by each task.|
|Functions that consume a large amount of CPU or wall-clock time||Try to reduce the number of calls to these functions as much as possible. Move functions to Task nodes if they are consistently waiting. A function can be determined to be waiting if its wall-clock time is high, but its CPU cost is low.|
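The "high wall-clock time but low CPU time" pattern can be checked mechanically once the metrics are exported. The following is a rough sketch: the row layout, the threshold values, and the assumption that CPU and wall-clock values share the same unit are all hypothetical and should be tuned to your own data.

```python
def waiting_candidates(rows, min_wall=100.0, max_cpu_fraction=0.1):
    """Return names of functions that mostly wait rather than compute.

    A function qualifies when its own wall-clock time (tm.self) is at least
    `min_wall` and its own CPU time (cpu.self) is at most `max_cpu_fraction`
    of that wall-clock time. Both thresholds are illustrative defaults.
    """
    candidates = []
    for row in rows:
        wall = row["tm.self"]
        cpu = row["cpu.self"]
        if wall >= min_wall and cpu <= wall * max_cpu_fraction:
            candidates.append(row["name"])
    return candidates


# Made-up rows: fetchFeed waits on the network, parseFeed is CPU-bound.
rows = [
    {"name": "fetchFeed", "cpu.self": 5, "tm.self": 900},
    {"name": "parseFeed", "cpu.self": 400, "tm.self": 420},
    {"name": "onKeyEvent", "cpu.self": 1, "tm.self": 2},
]
print(waiting_candidates(rows))
```

Functions flagged this way are the ones worth reviewing first as candidates for a Task node.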