Issue
We have a Linux system using kernel 3.14.17, PREEMPT RT. It is a single core system.
To reduce latency, our application sets the scheduling policy of some of its threads to SCHED_RR. However, this blocks the kworkers in the kernel, as they only run as SCHED_OTHER. The result is a kind of priority inversion: a low-priority SCHED_RR thread can prevent a higher-priority SCHED_RR thread from receiving data from the driver.
It is the TTY driver that is being blocked. It uses a work queue in the function tty_flip_buffer_push. Possibly more calls, but that is one we've identified.
Is there any way to easily fix this problem - a RT application being dependent on a kworker? We are hoping we don't have to hack the driver/kernel ourselves. Are there any kernel config options in the RT kernel for this kind of stuff? Can we,
- set a SCHED_RR priority for the kworkers?
- disable work queues for a specific driver?
If we'd have to hack the driver, we'd probably give it its own work queue, with a SCHED_RR kworker.
Of course, any other solution is also of interest. We can upgrade to a later kernel version if there is some new feature.
Solution
The root cause of this behaviour is tty_flip_buffer_push().
In kernel/drivers/tty/tty_buffer.c:518, tty_flip_buffer_push() schedules an asynchronous work item, which is executed shortly afterwards by a kworker thread.
However, if realtime threads are running and keep the system busy, the kworker thread is very unlikely to execute soon. It only gets a chance to run once the RT threads relinquish the CPU or RT throttling is triggered.
Older kernels support the low_latency flag within the TTY subsystem.
Prior to Linux kernel v3.15, tty_flip_buffer_push() honored the low_latency flag of the tty port.
If the low_latency flag was set by the UART driver as follows (typically in its .startup() function),
t->uport.state->port.tty->low_latency = 1;
then tty_flip_buffer_push() performs a synchronous copy in the context of the current function call itself. It thus automatically inherits the priority of the current task, i.e. there is no chance of a priority inversion caused by asynchronously scheduling a work item.
Note: If the serial driver sets the low_latency flag, it must avoid calling tty_flip_buffer_push() within an ISR (interrupt context). With the low_latency flag set, tty_flip_buffer_push() does NOT use a separate workqueue, but calls the functions directly. So if it is called within an interrupt context, the ISR will take longer to execute, which increases the latency of other parts of the kernel/system. Also, under certain conditions (depending on how much data is available in the serial buffer), tty_flip_buffer_push() may attempt to sleep (acquire a mutex). Sleeping within an ISR in the Linux kernel causes a kernel bug.
With the workqueue implementation within the Linux kernel having migrated to CMWQ, it is no longer possible to deterministically obtain independent execution contexts (i.e. separate threads) for individual workqueues.
All workqueues in the system are backed by kworker/* threads in the system.
NOTE: THIS SECTION IS OBSOLETE!!
Leaving the following intact as a reference for older versions of the Linux kernel.
Customisations for low-latency/real-time UART/TTY:
1. Create a private workqueue for the TTY layer.
Create a new workqueue in tty_init().
A workqueue created with create_workqueue() will have 1 worker thread for each CPU on the system.
struct workqueue_struct *create_workqueue(const char *name);
Using create_singlethread_workqueue() instead creates a workqueue with a single kworker thread.
struct workqueue_struct *create_singlethread_workqueue(const char *name);
2. Use the private workqueue.
Queue the flip buffer work on the above private workqueue instead of the kernel's global workqueue.
int queue_work(struct workqueue_struct *queue, struct work_struct *work);
Replace schedule_work() with queue_work() in functions called by tty_flip_buffer_push().
3. Tweak the execution priority of the private workqueue.
Upon boot, the kworker thread used by the TTY layer workqueue can be identified by the name string used while creating it. Set an appropriately higher RT priority on this thread using chrt, as required by the system design.
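As an alternative to running chrt by hand, step 3 could be done from a small user-space helper at boot. The following is only a sketch under assumptions: the function names find_pid_by_comm() and set_rr_priority() are made up for illustration, and it presumes an older (pre-CMWQ) kernel on which the private workqueue really has its own thread carrying the workqueue name. It looks the thread up through /proc/<pid>/comm and then calls sched_setscheduler(), which needs root or CAP_SYS_NICE.

```c
#include <assert.h>
#include <ctype.h>
#include <dirent.h>
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/*
 * Return the PID of the first task whose /proc/<pid>/comm equals `name`,
 * or -1 if no such task exists. Kernel threads are listed in /proc as well.
 * Note that the kernel truncates comm to 15 characters.
 */
pid_t find_pid_by_comm(const char *name)
{
    DIR *proc;
    struct dirent *ent;
    pid_t found = -1;

    assert(name != NULL);

    proc = opendir("/proc");
    if (!proc)
        return -1;

    while (found < 0 && (ent = readdir(proc)) != NULL) {
        char path[288];
        char comm[32];
        FILE *f;

        /* Only the numeric entries in /proc are tasks. */
        if (!isdigit((unsigned char)ent->d_name[0]))
            continue;

        snprintf(path, sizeof(path), "/proc/%s/comm", ent->d_name);
        f = fopen(path, "r");
        if (!f)
            continue; /* the task may have exited in the meantime */

        if (fgets(comm, sizeof(comm), f)) {
            comm[strcspn(comm, "\n")] = '\0'; /* strip trailing newline */
            if (strcmp(comm, name) == 0)
                found = (pid_t)atoi(ent->d_name);
        }
        fclose(f);
    }

    closedir(proc);
    return found;
}

/*
 * Switch task `pid` to SCHED_RR with the given priority.
 * Returns 0 on success, -1 on failure (errno is set by sched_setscheduler).
 */
int set_rr_priority(pid_t pid, int prio)
{
    struct sched_param sp;

    memset(&sp, 0, sizeof(sp));
    sp.sched_priority = prio;
    return sched_setscheduler(pid, SCHED_RR, &sp);
}
```

A boot-time program would call find_pid_by_comm() with whatever name was passed to create_singlethread_workqueue() (for example a hypothetical "tty_flip"), check that the result is not -1, and then call set_rr_priority() with a priority chosen relative to the application's own SCHED_RR threads. On CMWQ kernels the work runs on shared kworker/* threads, so this approach does not apply there.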
Answered By - TheCodeArtist Answer Checked By - Senaida (WPSolving Volunteer)