KIOFuse – Final Report

The GSoC coding period is now over and it is only appropriate that it is discussed what has been achieved and what needs to be done to see KIOFuse officially included in as a KDE project, allowing the 75324 bug report to be finally closed after a whole 15 years! Before I continue, I’d like to thank my mentors, Fabian Vogt (fvogt) and Chinmoy Ranjan Pradhan (chinmoyr) for all their support and advice during the course of GSoC. I’d also like to thank various reviewers of upstream code who quickly reviewed and merged code that I submitted. My previous posts (here and here) have discussed the work accomplished in May/June in detail.

Currently the way KIOFuse works is that I/O is implemented on top of a file-based cache, in particular, on temporary files. Reading and writing occurs on the temp file. Flushing works by calling KIO::put, which sends the data in our cache to the remote side via a TransferJob. However, whilst this is happening, there’s nothing stopping write requests coming in for that same node, which marks the cache as dirty. Once the job is done, we check if the node is dirty. If it is, we start another TransferJob, as it would be incorrect to say that the node is flushed. If this scenario keeps occurring, we’d never reply with a successful flush. A simple solution, which doesn’t guarantee that this scenario doesn’t occur but can decrease its likelihood, is as follows: every time a chunk of data is requested by the TransferJob we check if the node is dirty. If so, if it is less than 85% complete, restart the job, otherwise let it finish. The patch for this can be found here.

Another task was to refresh the attributes of nodes after a while. Currently, the existence of nodes is only checked lazily, i.e. if lookup or readdir are called. For each new node found (or created) the stat struct (the node’s attributes) is filled with the values from KIO::UDSEntry. However, this is only done once and any changes on the remote side are not noticed. One could always refresh the attributes on every lookup but that may be overzealous, and so the solution chosen was to do a KIO::listDirin readdir if it hasn’t been called on that node in the last 30 seconds. The patch for this can be found here.

Another problem that KIOFuse had was that write permission wasn’t checked, we’d just forward the permission bits received from the remote side and if we in fact couldn’t write we’d only know during flush, which is a bit too late. Although we cannot fully guarantee write access, there are steps taken to try and get as close as possible to a guarantee. First we start a KIO::chown job, which changes the owner to ourself. If we can, then we assume that the owner permission bits are valid, and forward them. If KIO::chown isn’t supported we just allow write requests to go through, as there is simply no way we can actually check. If the job fails (i.e. we’re not the owner of the file), then changing the modification time can actually help us; it doesn’t help us if we’re the owner as we can change the modification time independent of our write permission. If we can change it, then we can write, if KIO::setModificationTime isn’t supported, we just allow write requests to go through. The patch for this can be found here.

As mentioned previously, file I/O is implemented on top of a file-based cache. Data on the remote side is transferred via the help of KIO::get and KIO::put. Some slaves support KIO::open, in particular sftp/smb/file, which means that it is possible to engage in seek-based file I/O. This means that we do not need to use a file-based cache for those slaves, and can simply read and write directly from and to the remote side. This obviously introduces a bit more latency, but also means that we can easily handle large files, such as videos. The biggest issue with getting this implemented is that the documentation on the how to implement KIO::open correctly and its assorted functions consists of one-liners. This means that each slave author has their own idea of what it means to read/write/seek. The solution to this was to study what all three slaves did, and converge on one definition of what we mean by the different functions provided by the FileJob interface. In addition several bugs were squashed. The most surprising of which was the close signal never being emitted; a big sign that no one has used this API properly since its inception in 2006! Issues in the smb/sftp slaves and the mtp slaves were also fixed. The above mentioned patches have all been merged meaning that KIOFuse now requires KF5 Frameworks 5.62 (and kio-extras 19.08.1). The patch that allows KIOFuse to take advantage of KIO::open can be found here.

The main benefit of KIOFuse is obviously integration into KIO itself. The idea is to create a KIOFuse KDED module loaded at startup, which starts the kio-fuse process. On the KIO side, every time we wish to send a KIO URL to an app that we identify as not supporting the given URL, a DBus request is sent to our KDED module, which mounts the URL and sends back the local path of the URL. We then pass it to the application. The beauty of this is that the conversion is transparent and requires no setup from the user; it also induces no slowdown to KIO-enabled applications. In fact, many people won’t even be able to tell that they’re using KIOFuse, non-KDE apps will seamlessly access the KIOFuse path instead of KIO URLs they don’t understand. The patch for the KDED module can be found here, and the patch in KIO that uses that KDED module can be found here.

So, what’s required for KIOFuse to be production ready? Firstly, some of the linked MRs have not been merged, which just requires a bit of time to get it reviewed and in. Secondly, there are still some bugs that need resolving. GDrive files which don’t have a size (usually GDoc files) get corrupted on read (and potentially write), this issue has been looked at but I’ve not yet found a resolution. MTP doesn’t seem to work at all for some reason. Whether this is an issue in the MTP slave or KIOFuse has not been determined, which is making it harder to resolve this issue. Another thorny issue is that conversion from a local path to a remote URL is a bit buggy. This should be resolvable, but it needs to time to get it right. Ideally, we’d like more testing on all slaves. One potential cool way of testing KIOFuse is using fio, which performs several intensive tests, usually for kernel file systems; this is something we definitely should explore. There is also the question of how KIOFuse will be included in KDE, for example, will it be a framework?

Note that I’m at Akademy, so feel free to ask any questions about it either there or on this blog.

6 thoughts on “KIOFuse – Final Report

  1. Thanks so much for this very valuable work, Alexander! With the conclusion of GSoC, where can people follow the continuing work to integrate the code into the KDE codebase? Is there a Phab task?

    Like

    1. Yeah there’s a phab task. Although its currently not helpful atm. gitlab.com/Vogtinator/kio-fuse is where all the MRs are located, apart from the phabricator one regarding KDED modules (mentioned in the blog post).

      Like

Leave a comment