KIOFuse – Final Report

The GSoC coding period is now over, so it is only appropriate to discuss what has been achieved and what still needs to be done to see KIOFuse officially included as a KDE project, allowing bug report 75324 to finally be closed after a whole 15 years! Before I continue, I’d like to thank my mentors, Fabian Vogt (fvogt) and Chinmoy Ranjan Pradhan (chinmoyr), for all their support and advice during the course of GSoC. I’d also like to thank the various reviewers of upstream code who quickly reviewed and merged the code that I submitted. My previous posts (here and here) have discussed the work accomplished in May/June in detail.

Currently, KIOFuse implements I/O on top of a file-based cache, specifically temporary files. Reads and writes occur on the temp file. Flushing works by calling KIO::put, which sends the data in our cache to the remote side via a TransferJob. However, whilst this is happening, there’s nothing stopping write requests coming in for that same node, which marks the cache as dirty. Once the job is done, we check if the node is dirty. If it is, we start another TransferJob, as it would be incorrect to say that the node is flushed. If this scenario keeps occurring, we’d never reply with a successful flush. A simple mitigation, which doesn’t rule this scenario out but decreases its likelihood, is as follows: every time the TransferJob requests a chunk of data, we check if the node is dirty. If it is and the job is less than 85% complete, we restart the job; otherwise we let it finish. The patch for this can be found here.
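To make the restart rule concrete, here is a minimal Python sketch of the decision taken each time a chunk is requested. It is not the actual KIOFuse code (which is C++ on top of KIO); the function name, the return values and the constant name are made up for illustration, and only the 85% threshold comes from the description above.

```python
# Hypothetical sketch of the flush-restart heuristic; names are illustrative.
RESTART_THRESHOLD = 0.85  # restart only if the upload is under 85% complete


def on_chunk_requested(bytes_sent: int, total_bytes: int, dirty: bool) -> str:
    """Decide what to do when the transfer job asks for its next chunk."""
    if not dirty:
        # Nothing was written since the upload started: keep going.
        return "continue"
    # Treat an empty file as already complete to avoid dividing by zero.
    progress = bytes_sent / total_bytes if total_bytes else 1.0
    if progress < RESTART_THRESHOLD:
        # Cheap to start over, and the new upload will include the new data.
        return "restart"
    # Nearly done: let it finish. The node is still dirty afterwards,
    # so a second upload would follow before the flush is reported as done.
    return "continue"
```

The trade-off is that an upload that is almost finished is never thrown away, at the cost of one more upload afterwards if the node was dirtied late.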

Another task was to refresh the attributes of nodes after a while. Currently, the existence of nodes is only checked lazily, i.e. when lookup or readdir is called. For each new node found (or created), the stat struct (the node’s attributes) is filled with the values from KIO::UDSEntry. However, this is only done once, so any later changes on the remote side go unnoticed. One could always refresh the attributes on every lookup, but that may be overzealous, and so the solution chosen was to do a KIO::listDir in readdir if it hasn’t been called on that node in the last 30 seconds. The patch for this can be found here.
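The 30-second rule boils down to remembering when a directory was last listed. A rough Python sketch of that bookkeeping follows; the class and method names are hypothetical and this is not how the KIOFuse node classes are actually laid out.

```python
# Hypothetical sketch of the "re-list after 30 seconds" rule.
from typing import Optional

REFRESH_INTERVAL_S = 30.0  # interval taken from the post


class DirNode:
    """Stand-in for a directory node; only tracks the last listing time."""

    def __init__(self) -> None:
        self.last_listed: Optional[float] = None  # None means never listed

    def needs_refresh(self, now: float) -> bool:
        # Refresh if we have never listed, or the last listing is too old.
        return self.last_listed is None or now - self.last_listed >= REFRESH_INTERVAL_S

    def mark_listed(self, now: float) -> None:
        self.last_listed = now
```

In a readdir handler one would call `needs_refresh(time.monotonic())`, list the directory remotely only if it returns true, and then call `mark_listed` with the same clock.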

Another problem that KIOFuse had was that write permission wasn’t checked: we’d just forward the permission bits received from the remote side, and if we in fact couldn’t write we’d only find out during flush, which is a bit too late. Although we cannot fully guarantee write access, there are steps we take to get as close as possible to a guarantee. First we start a KIO::chown job, which changes the owner to ourselves. If it succeeds, we assume that the owner permission bits are valid and forward them. If KIO::chown isn’t supported, we just allow write requests to go through, as there is simply no way we can actually check. If the job fails (i.e. we’re not the owner of the file), then trying to change the modification time can help us; this test is of no use when we are the owner, as the owner can change the modification time regardless of write permission. If we can change it, then we can write. If KIO::setModificationTime isn’t supported, we again just allow write requests to go through. The patch for this can be found here.
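Since that is quite a few branches, here is the same decision tree as a small Python sketch. The enum, the function and the returned strings are invented for illustration. The final branch (both probes fail, so treat the file as read-only) is my assumption of the natural conclusion; it is not spelled out above.

```python
# Hypothetical sketch of the write-access probe; not the real KIOFuse code.
from enum import Enum


class Outcome(Enum):
    SUCCESS = "success"
    FAILED = "failed"
    UNSUPPORTED = "unsupported"


def probe_write_access(chown: Outcome, set_mtime: Outcome) -> str:
    """Combine the results of the chown and set-modification-time probes."""
    if chown is Outcome.UNSUPPORTED:
        return "allow"  # no way to check, so let writes through
    if chown is Outcome.SUCCESS:
        return "trust-owner-bits"  # we own the file: forward the owner bits
    # chown failed, so we are not the owner: try the modification time.
    if set_mtime in (Outcome.SUCCESS, Outcome.UNSUPPORTED):
        return "allow"
    return "deny"  # assumption: both probes failed, so treat as read-only
```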

As mentioned previously, file I/O is implemented on top of a file-based cache. Data is transferred to and from the remote side with the help of KIO::get and KIO::put. Some slaves, in particular sftp/smb/file, support KIO::open, which means that it is possible to engage in seek-based file I/O. For those slaves we do not need a file-based cache, and can simply read and write directly from and to the remote side. This obviously introduces a bit more latency, but also means that we can easily handle large files, such as videos. The biggest issue with getting this implemented is that the documentation on how to implement KIO::open and its assorted functions correctly consists of one-liners. This means that each slave author has their own idea of what it means to read/write/seek. The solution to this was to study what all three slaves did, and converge on one definition of what we mean by the different functions provided by the FileJob interface. In addition, several bugs were squashed. The most surprising of these was that the close signal was never emitted, a big sign that no one has used this API properly since its inception in 2006! Issues in the smb/sftp slaves and the mtp slaves were also fixed. The above-mentioned patches have all been merged, meaning that KIOFuse now requires KDE Frameworks 5.62 (and kio-extras 19.08.1). The patch that allows KIOFuse to take advantage of KIO::open can be found here.

The main benefit of KIOFuse is obviously integration into KIO itself. The idea is to create a KIOFuse KDED module loaded at startup, which starts the kio-fuse process. On the KIO side, every time we wish to send a KIO URL to an app that we identify as not supporting the given URL, a DBus request is sent to our KDED module, which mounts the URL and sends back the local path of the URL. We then pass it to the application. The beauty of this is that the conversion is transparent and requires no setup from the user; it also induces no slowdown to KIO-enabled applications. In fact, many people won’t even be able to tell that they’re using KIOFuse: non-KDE apps will seamlessly access the KIOFuse path instead of KIO URLs they don’t understand. The patch for the KDED module can be found here, and the patch in KIO that uses that KDED module can be found here.

So, what’s required for KIOFuse to be production ready? Firstly, some of the linked MRs have not been merged yet, which just requires a bit of time to get them reviewed and in. Secondly, there are still some bugs that need resolving. GDrive files which don’t have a size (usually GDoc files) get corrupted on read (and potentially write). This issue has been looked at, but I’ve not yet found a resolution. MTP doesn’t seem to work at all for some reason. Whether this is an issue in the MTP slave or in KIOFuse has not been determined, which is making it harder to resolve. Another thorny issue is that conversion from a local path to a remote URL is a bit buggy. This should be resolvable, but it needs time to get right. Ideally, we’d like more testing on all slaves. One potentially cool way of testing KIOFuse is using fio, which performs several intensive tests, usually for kernel file systems; this is something we definitely should explore. There is also the question of how KIOFuse will be included in KDE: for example, will it be a framework?

Note that I’m at Akademy, so feel free to ask any questions about it either there or on this blog.

KIOFuse: June in Review

The coding period has now run for over a month and quite a few improvements have been merged into KIOFuse. In my last post I mentioned the development of a KIO error to FUSE error mapping and 32-bit support.

However, interestingly enough, it took quite a long time for the 32-bit support branch to be merged. This was because of a test that neither failed nor passed: it froze. The test suite would never finish and the process would only respond to SIGKILL. After days of debugging it was determined that the fuse_notify_inval_* functions don’t play well when writeback caching is enabled, and hence there is now a patch to disable it. Of course this will incur a performance hit, as writes will go straight to KIOFuse, and hence straight to disk (although the kernel may cache our write requests to our own cache). Whilst this is unfortunate, seeing as most KIO slaves are network based, switching from a writeback caching policy to a writethrough one is unlikely to hamper performance too much.

In other news, KIOFuse can now handle SIGTERM, SIGINT and SIGHUP signals. Signal handlers can only call async-signal-safe functions. However, in Qt there is a bit of a hack one can perform, as inspired by this tutorial. Hence, in response to these signals, KIOFuse will flush all dirty nodes to disk, meaning no sudden data loss!

Mounts can now have their password changed.

The lookup function has now been optimised. Previously a lookup would call KIO::listDir, which was totally unnecessary – a KIO::stat would suffice and this is what the patch has switched to. It also increased the data buffer from 1MB to 14MB.

Unmount support has currently been postponed. It is proving problematic to implement reliably and unmounting only really provides a marginal benefit, so it is yet to be seen if we will implement it at all. The current WIP patch can be found here.

It has been decided that slaves that do not implement KIO::stat will not be supported. It’s a bit of a hassle to implement with extremely marginal benefit. There are only a few slaves that don’t implement KIO::stat, such as fonts and thumbnails.

An issue with KDE Connect not working properly has been fixed upstream. I haven’t 100% confirmed which patch has fixed this for us, but I’m placing my bets on this one.

Currently, the Google Drive API reports a size of zero for files that are not supported by GDrive, such as odt files, and for Google’s own proprietary formats, i.e. Google Docs. Whilst we can update the size quite easily by downloading the file, the file turns out corrupted, and is only openable if the program has a repair option (such as LibreOffice). Unfortunately, I’ve not been able to find out why exactly this is happening, and have not come up with a fix. For now this is being shelved and I hope to revisit it at a later date with a fresher mind.

We’d welcome anyone to use and test KIOFuse. Feel free to notify us of any bugs or performance issues by opening an issue, and you can even contribute a patch if you wish!

KIOFuse: 32-bit Support

The first two weeks of the GSoC coding period are now over.

Firstly, a mapping between KIO errors and FUSE errors has now been established. Previously all KIO Job errors were simply sent back to FUSE as EIO, which isn’t entirely accurate. The mapping now provides more accurate error replies.
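As a rough illustration of what such a mapping looks like, here is a small Python sketch. The keys are named after error codes from KIO’s error enum, but the specific pairings below are my own plausible guesses for illustration and are not copied from the KIOFuse source; the function name is also made up. Anything without a mapping falls back to EIO, as before.

```python
# Hypothetical sketch of a KIO-error-to-errno table; pairings are illustrative.
import errno

KIO_TO_ERRNO = {
    "ERR_DOES_NOT_EXIST": errno.ENOENT,
    "ERR_ACCESS_DENIED": errno.EACCES,
    "ERR_WRITE_ACCESS_DENIED": errno.EACCES,
    "ERR_IS_DIRECTORY": errno.EISDIR,
    "ERR_FILE_ALREADY_EXIST": errno.EEXIST,
    "ERR_DISK_FULL": errno.ENOSPC,
}


def kio_error_to_errno(kio_error: str) -> int:
    """Return the errno to hand back to FUSE, defaulting to EIO."""
    return KIO_TO_ERRNO.get(kio_error, errno.EIO)
```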

A major new addition is 32-bit support. KIOFuse did not compile on 32-bit, but these compilation errors have now been fixed. They mostly stemmed from the fact that size_t has a different size on different architectures, and that file sizes should always be represented as off_t anyway.
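To show why the type matters beyond just compiling, here is a tiny Python simulation of what happens when a large file size is squeezed into a 32-bit unsigned integer, which is what size_t is on a typical 32-bit system. The helper name is made up, and the comparison with a 64-bit off_t assumes large file support is enabled in the build.

```python
# Simulation only: Python ints are unbounded, so we mask to mimic a 32-bit size_t.
GIB = 2**30


def as_size_t_32(value: int) -> int:
    """Mimic storing a value in a 32-bit unsigned integer (wraps modulo 2**32)."""
    return value & 0xFFFFFFFF


file_size = 5 * GIB                   # a 5 GiB file
truncated = as_size_t_32(file_size)   # what a 32-bit size_t would hold
print(truncated // GIB)               # prints 1: the size silently became 1 GiB
```

A 64-bit off_t holds the full value, which is why file sizes and offsets belong in off_t rather than size_t.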

Another big question was whether files larger than 4GiB could be sent without data corruption. A switch from the C standard I/O functions to UNIX I/O functions was necessary to avoid data corruption. However, this alone was not sufficient. On 32-bit, with files larger than 4GiB, I noticed that no write call was occurring, although it did occur on 64-bit and on 32-bit with smaller files. I couldn’t pin this down to anything in KIOFuse (seeing as KIOFuse simply responds to requests passed by FUSE, what can we do if we don’t receive the necessary request?). Luckily one of my mentors, fvogt, spotted a kernel bug report for exactly this problem, which was fixed about a month ago. This means that transferring files larger than 4GiB via KIOFuse is only supported on distros running kernel 5.1.5 or later. Because of this, we do not recommend packaging or using KIOFuse on older kernels, due to likely data corruption when transferring large files. This patch is still not in master and is under active review. We are looking into whether adding a unit test is feasible.
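For packagers who want a quick sanity check, a release string such as the one printed by `uname -r` can be compared against 5.1.5. The Python sketch below is only a rough guide: the function names are made up, and a distro kernel may carry the fix as a backport under an older version number, which a plain version comparison cannot see.

```python
# Hypothetical helper: compare a kernel release string against 5.1.5.
import re

MIN_KERNEL = (5, 1, 5)  # first version named in the post as containing the fix


def parse_kernel_release(release: str) -> tuple:
    """Extract (major, minor, patch) from strings like '5.1.5-arch1-2'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    major, minor, patch = match.groups()
    return (int(major), int(minor), int(patch or 0))  # missing patch counts as 0


def kernel_is_new_enough(release: str) -> bool:
    return parse_kernel_release(release) >= MIN_KERNEL
```

On a running system the input would come from `platform.release()`.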

I’d like to thank my mentors, fvogt and chinmoy, for their help and guidance so far and look forward to improving KIOFuse further. Over the next week, I will be working on signal handling and password change/unmount (if necessary) support.

KIOFuse – GSoC 2019

It’s been a great pleasure to be chosen to work with KDE during GSoC this year. I’ll be working on KIOFuse and hopefully by the end of the coding period it will be well integrated with KIO itself. Development will mainly be coordinated on the #kde-fm channel (IRC Nick: feverfew) with fortnightly updates on my blog, so feel free to pop by! Here’s a small snippet of my proposal to give everyone an idea of what I’ll be working on:

KIOSlaves are a powerful feature within the KIO framework, allowing KIO-aware applications such as Dolphin to interact with services out of the local filesystem over URLs such as fish:// and gdrive:/. However, KIO-unaware applications are unable to interact seamlessly with KIO Slaves. For example, editing a file in gdrive:/ in LibreOffice will not save changes to your Google Drive. One potential solution is to make use of FUSE, which is an interface provided by the Linux kernel, which allows userspace processes to provide a filesystem which can be mounted and accessed by regular applications. KIOFuse is a project by fvogt that allows the possibility to mount KIO filesystems in the local system; therefore exposing them to POSIX-compliant applications such as Firefox and LibreOffice.

This project intends to polish KIOFuse such that it is ready to be a KDE project. In particular, I’ll be focusing on the following four broad goals:
• Improving compatibility with KDE and non-KDE applications by extending and improving supported filesystem operations.
• Improving KIO Slave support.
• Performance and usability improvements.
• Adding a KDE Daemon module to allow the management of KIOFuse mounts and the translation of KIO URLs to their local path equivalents.