[3/3] POHMELFS: core files.

!MAILaRCHIVE_VOTE_RePLACE
Previous message: [thread] [date] [author]
Next message: [thread] [date] [author]
To: <linux-kernel@...>
Cc: <netdev@...>, <linux-fsdevel@...>
Date: Monday, July 7, 2008 - 2:11 pm

POHMELFS core.

Signed-off-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru>

diff --git a/Documentation/filesystems/pohmelfs/design_notes.txt b/Documentation/filesystems/pohmelfs/design_notes.txt
new file mode 100644
index 0000000..0b20a47
--- /dev/null
+++ b/Documentation/filesystems/pohmelfs/design_notes.txt
@@ -0,0 +1,70 @@
+POHMELFS: Parallel Optimized Host Message Exchange Layered File System.
+
+		Evgeniy Polyakov <johnpol@2ka.mipt.ru>
+
+Homepage: http://tservice.net.ru/~s0mbre/old/?section=projects&item=pohmelfs
+
+It was first started as network filesystem with coherent local data and metadata caches,
+but it is being evolved into parallel distibuted filesystem now.
+
+Main features of this FS include:
+ * Local coherent (notes cache for data and metadata:
+ 	http://tservice.net.ru/~s0mbre/blog/devel/fs/2008_05_17.html)
+ * Completely async processing of all events (hard, symlinks and rename are the
+ 	only exceptions) including object creation and data reading and writing.
+ * Flexible object architecture optimized for network processing.
+ 	Ability to create long pathes to object and remove arbitrary huge
+	directoris in single network command.
+	(like removing the whole kernel tree via single network command).
+ * Very high performance.
+ * Fast and scalable multithreaded userspace server. Being in userspace it works
+ 	with any underlying filesystem and still is much faster than async in-kernel NFS one.
+ * Client is able to switch between different servers (if one goes down, client
+ 	automatically reconnects to second and so on).
+ * Transactions support. Full failover for all operations.
+ 	Resending transactions to different servers on timeout or error.
+ * Read requests (data read, directory listing, lookup requests) balancing between multiple servers.
+ * Write requests are sent to multiple servers and completed only when all of them sent an ack.
+ * Ability to add and/or remove servers from working set at run-time from userspace (via netlink,
+ 	so the same command can be processed from real network though, but since server does
+	not support it yet, I dropped network part).
+
+
+POHMELFS is based on transactions, which are potentially long-standing objects, which live
+in clients memory. Each transaction contains all information needed to process given command
+(or set of command, which is frequently used during data writing: single transaction can contain
+creation and data writing commands). Transaction is committed by all servers, where it was sent,
+and in case of failure of one or another, it will be eventually resent or dropped with error,
+so, for example, reading will return error, if no servers are available.
+
+POHMELFS uses novel asynchronous approach of data processing. Courtesy to transactions, it is
+possible to detouch reply from request, and if command requires data to be received, caller
+just sleeps waiting for it. Thus it is possible to issue multiple read commands to different
+servers and async threads will pick replies in parallel, find appropriate transactions in the
+system and put data where it belongs (like page or inode cache).
+
+Main feature of the POHMELFS is writeback data and metadata cache.
+Only few non-performance critical operations use write-through cache and are synchronous:
+hard and symbolic link creation and object rename. Creation and removal of objects,
+as long as writing, are asynchronous and are sent to the server during system writeback.
+When server receives some request for given object in the system (like data reading,
+or file creation or whatever else), it stores appropriate client information in own cache,
+so when subsequent request comes from different client, all previous could be notified
+(for example when several clients read data from file, and then new client writes there,
+appropriate pages on clients will be invalidated, so subsequent write will force them
+to read page from the server). Because of this feature POHMELFS is extremely fast in metadata
+intensive workloads, and can fully utilize bandwidth to servers when doing bulk data transafers.
+
+POHMELFS client operates with working set of servers, and is capable of balancing reading,
+i.e. no modification, like lookup or directory listing, workload between all servers it was conneted to.
+Administrator can add or remove servers from the set in run-time via special command (described
+in Documentation/pohmelfs/info.txt file). Writing is performed to all servers.
+
+POHEMLFS is capable of full data channel encryption and/or strong crypto hashing.
+One can select kernel supported cipher, encryption mode, hash type and operation mode
+(hmac or digest). It is also possible to use both or neither (default). Crypto configuration
+is checked during mount time, and if server does not support it, appropriate capabilities
+will be disabled or mount will fail (if 'crypto_fail_unsupported' mount option is specified).
+Crypto performance heavily depends on number of crypto threads, which asynchronously perform
+crypto operations and send data to server or submit it higher to the stack. Number of crypto
+threads is controlled via mount option.
diff --git a/Documentation/filesystems/pohmelfs/info.txt b/Documentation/filesystems/pohmelfs/info.txt
new file mode 100644
index 0000000..f96f09d
--- /dev/null
+++ b/Documentation/filesystems/pohmelfs/info.txt
@@ -0,0 +1,81 @@
+POHMELFS usage information.
+
+Mount options:
+idx=%u
+ Each mountpoint is associated with special index via this option.
+ Administrator can add or remove servers from given index, so all mounts,
+ which were attached to it, were updated.
+ Default it is 0.
+
+trans_scan_timeout=%u
+ This timeout, expressed in milliseconds, specifies time to scan trasaction
+ trees looking for stale requests, which have to be resent, or if number of
+ retries exceed specified limit, dropped with error.
+ Default is 5 seconds.
+
+drop_scan_timeout=%u
+ Internal timeout, expressed in milliseconds, which specifies how frequently
+ inodes marked to be dropped are freed. It also specifies how frequently
+ system checks, that servers has to be added or removed from current working set.
+ Default is 1 second.
+
+wait_on_page_timeout=%u
+ Number of milliseconds to wait for reply from remote server for data reading command.
+ If this timeout is exceeded, reading returns error.
+ Default is 5 seconds.
+
+trans_retries=%u
+ Number of times, transaction will be resent to the server, which did not answer for the
+ last @trans_scan_timeout milliseconds. When number of resends exceeds this limit,
+ transaction is completed with error.
+ Default is 5 resends.
+
+crypto_thread_num=%u
+ Number of crypto processing threads. Threads are used both for RX and TX traffic.
+ Default is 2, or no threads if crypto operations are not supported.
+
+trans_max_pages=%u
+ Maximum number of pages in single transaction. This parameter also control number of pages,
+ allocated for crypto processing (each crypto thread has pool of pages, number of which is
+ equal to 'trans_max_pages'.
+ Default is 100 pages.
+
+crypto_fail_unsupported
+ If specified, mount will fail if server does not support requested crypto operations.
+ By default mount will disable non-matching crypto operations.
+
+Usage examples.
+
+Add (or remove if it already exists) server server1.net:1025 into working set with index $idx
+with appropriate hash algorithm and key file and cipher algorithm, mode and key file:
+$cfg -a server1.net -p 1025 -i $idx -K $hash_key -H "hmac(sha1)" -C "cbc(aes)" -k $cipher_key
+
+Mount filesystem with given index $idx to /mnt mountpoint.
+Client will connect to all servers specified in working set via previous command:
+mount -t pohmel -o idx=$idx q /mnt
+
+One can add or remove servers from working set after mounting too.
+
+
+Server installation.
+
+Creating a server, which listens at port 1025 and 0.0.0.0 address.
+Working root directory (note, that server chroots there, so you have to have appropriate permissions)
+is set to /mnt, server supports SHA1 hasing (with 'hash_key' key file) and AES-CBC encryption
+(with 'cipher_key' key file, its size specifies cipher blocksize)
+Number of working threads is set to 10.
+
+# ./fserver -a 0.0.0.0 -p 1025 -r /mnt -w 10 -K hash_key -H "hmac(sha1)" -C "cbc(aes)" -k cipher_key
+
+ -r root                 - path to root directory. Default: /tmp.
+ -a addr                 - listen address. Default: 0.0.0.0.
+ -p port                 - listen port. Default: 1025.
+ -w workers              - number of workers per connected client. Default: 1.
+ -K file		 - hash key size. Default: none.
+ -H hash		 - hash type. Default: none.
+ -k file		 - cipher key size. Default: none.
+ -C hash		 - cipher type. Crypto mode should be specified as 'mode(cipher)'. Default: none.
+ -h                      - this help.
+
+Number of worker threads specifies how many workers will be created for each client.
+Bulk single-client transafers usually are better handled with smaller number (like 1-3).
diff --git a/Documentation/filesystems/pohmelfs/network_protocol.txt b/Documentation/filesystems/pohmelfs/network_protocol.txt
new file mode 100644
index 0000000..bc13058
--- /dev/null
+++ b/Documentation/filesystems/pohmelfs/network_protocol.txt
@@ -0,0 +1,220 @@
+POHMELFS network protocol.
+
+Basic structure used in network communication is following command:
+
+struct netfs_cmd
+{
+	__u16			cmd;	/* Command number */
+	__u16			csize;	/* Attached crypto information size */
+	__u16			cpad;	/* Attached padding size */
+	__u16			ext;	/* External flags */
+	__u32			size;	/* Size of the attached data */
+	__u32			trans;	/* Transaction id */
+	__u64			id;	/* Object ID to operate on. Used for feedback.*/
+	__u64			start;	/* Start of the object. */
+	__u64			iv;	/* IV sequence */
+	__u8			data[0];
+};
+
+Commands can be embedded into transaction command (which in turn has own command),
+so one can extend protocol as needed without breaking backward compatibility as long
+as old commands are supported. All string lengths include tail 0 byte.
+
+@cmd - command number, which specifies command to be processed. Following
+	commands are used currently:
+
+	NETFS_READDIR	= 1,	/* Read directory for given inode number */
+	NETFS_READ_PAGE,	/* Read data page from the server */
+	NETFS_WRITE_PAGE,	/* Write data page to the server */
+	NETFS_CREATE,		/* Create directory entry */
+	NETFS_REMOVE,		/* Remove directory entry */
+	NETFS_LOOKUP,		/* Lookup single object */
+	NETFS_LINK,		/* Create a link */
+	NETFS_TRANS,		/* Transaction */
+	NETFS_OPEN,		/* Open intent */
+	NETFS_INODE_INFO,	/* Metadata cache coherency synchronization message */
+	NETFS_JOIN_GROUP,	/* Joing metadata update group */
+	NETFS_LEAVE_GROUP,	/* Leave metadata update group */
+	NETFS_PAGE_CACHE,	/* Page cache invalidation message */
+	NETFS_READ_PAGES,	/* Read multiple contiguous pages in one go */
+	NETFS_RENAME,		/* Rename object */
+	NETFS_CAPABILITIES,	/* Capabilities of the client, for example supported crypto */
+
+@ext - external flags. Used by different commands to specify some extra arguments
+	like partial size of the embedded objects or creation flags.
+
+@size - size of the attached data. For NETFS_READ_PAGE and NETFS_READ_PAGES no data is attached,
+	but size of the requested data is incorporated here. It does not include size of the command
+	header (struct netfs_cmd) itself.
+
+@id - id of the object this command operates on. Each command can use it for own purpose.
+
+@start - start of the object this command operates on. Each command can use it for own purpose.
+
+@csize, @cpad - size and padding size of the (attached if needed) crypto information.
+
+Command specifications.
+
+@NETFS_READDIR
+This command is used to sync content of the remote dir to the client.
+
+@ext - length of the path to object.
+@size - the same.
+@id - local inode number of the directory to read.
+@start - zero.
+
+
+@NETFS_READ_PAGE
+This command is used to read data from remote server.
+Data size does not exceed local page cache size.
+
+@id - inode number.
+@start - first byte offset.
+@size - number of bytes to read plus length of the path to object.
+@ext - object path length.
+
+
+@NETFS_CREATE
+Used to create object.
+It does not require that all directories on top of the object were
+already created, it will create them automatically. Each object has
+associated @netfs_path_entry data structure, which contains creation
+mode (permissions and type) and length of the name as long as name itself.
+
+@start - 0
+@size - size of the all data structures needed to create a path
+@id - local inode number
+@ext - 0
+
+
+@NETFS_REMOVE
+Used to remove object.
+
+@ext - length of the path to object.
+@size - the same.
+@id - local inode number.
+@start - zero.
+
+
+@NETFS_LOOKUP
+Lookup information about object on server.
+
+@ext - length of the path to object.
+@size - the same.
+@id - local inode number of the directory to look object in.
+@start - local inode number of the object to look at.
+
+
+@NETFS_LINK
+Create hard of symlink.
+Command is sent as "object_path|target_path".
+
+@size - size of the above string.
+@id - parent local inode number.
+@start - 1 for symlink, 0 for hardlink.
+@ext - size of the "object_path" above.
+
+
+@NETFS_TRANS
+Transaction header.
+
+@size - incorporates all embedded command sizes including theirs header sizes.
+@start - transaction generation number - unique id used to find transaction.
+@ext - transaction flags. Unused at the moment.
+@id - 0.
+
+
+@NETFS_OPEN
+Open intent for given transaction.
+
+@id - local inode number.
+@start - 0.
+@size - path length to the object.
+@ext - open flags (O_RDWR and so on).
+
+
+@NETFS_INODE_INFO
+Metadata update command.
+It is sent to servers when attributes of the object are changed and received
+when data or metadata were updated. It operates with the following structure:
+
+struct netfs_inode_info
+{
+	unsigned int		mode;
+	unsigned int		nlink;
+	unsigned int		uid;
+	unsigned int		gid;
+	unsigned int		blocksize;
+	unsigned int		padding;
+	__u64			ino;
+	__u64			blocks;
+	__u64			rdev;
+	__u64			size;
+	__u64			version;
+};
+
+It effectively mirrors stat(2) returned data.
+
+
+@ext - path length to the object.
+@size - the same plus size of the netfs_inode_info structure.
+@id - local inode number.
+@start - 0.
+
+
+@NETFS_JOIN_GROUP/NETFS_LEAVE_GROUP
+Metadata cache coherency synchronization messages.
+They are broadcasted when new inode is created (either for new object
+or object read from the server), so that server new inode number of the
+object on the appropriate client. @NETFS_LEAVE_GROUP is sent when local
+inode is destroyed, so that client physically can not be interested in
+data or metadata updates for given inode.
+
+@ext - path length to the object.
+@size - the same.
+@id - local inode number for given object.
+@start - 0.
+
+
+@NETFS_PAGE_CACHE
+Command is only received by clients. It contains information about
+page to be marked as not up-to-date. Server fills this command with data,
+provided by above @NETFS_JOIN_GROUP command.
+
+@id - client's inode number.
+@start - last byte of the page to be invalidated. If it is not equal to
+	current inode size, it will be vmtruncated().
+@size - 0
+@ext - 0
+
+
+@NETFS_READ_PAGES
+Used to read multiple contiguous pages in one go.
+
+@start - first byte of the contiguous region to read.
+@size - contains of two fields: lower 8 bits are used to represent page cache shift
+	used by client, another 3 bytes are used to get number of pages.
+@id - local inode number.
+@ext - path length to the object.
+
+
+@NETFS_RENAME
+Used to rename object.
+Attached data is formed into following string: "old_path|new_path".
+
+@id - local inode number.
+@start - parent inode number.
+@size - length of the above string.
+@ext - length of the old path part.
+
+
+@NETFS_CAPABILITIES
+Used to exchange crypto capabilities with server.
+If crypto capabilities are not supported by server, then client will disable it
+or fail (if 'crypto_fail_unsupported' mount options was specified).
+
+@id - superblock index. Used to specify crypto information for group of servers.
+@size - size of the attached capabilities structure.
+@start - 0.
+@size - 0.
+@scsize - 0.
diff --git a/fs/Kconfig b/fs/Kconfig
index c509123..59935cd 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -1566,6 +1566,8 @@ menuconfig NETWORK_FILESYSTEMS
 
 if NETWORK_FILESYSTEMS
 
+source "fs/pohmelfs/Kconfig"
+
 config NFS_FS
 	tristate "NFS file system support"
 	depends on INET
diff --git a/fs/Makefile b/fs/Makefile
index 1e7a11b..6ce6a35 100644
--- a/fs/Makefile
+++ b/fs/Makefile
@@ -119,3 +119,4 @@ obj-$(CONFIG_HPPFS)		+= hppfs/
 obj-$(CONFIG_DEBUG_FS)		+= debugfs/
 obj-$(CONFIG_OCFS2_FS)		+= ocfs2/
 obj-$(CONFIG_GFS2_FS)           += gfs2/
+obj-$(CONFIG_POHMELFS)		+= pohmelfs/
diff --git a/fs/pohmelfs/Kconfig b/fs/pohmelfs/Kconfig
new file mode 100644
index 0000000..61f66b0
--- /dev/null
+++ b/fs/pohmelfs/Kconfig
@@ -0,0 +1,33 @@
+config POHMELFS
+	tristate "POHMELFS filesystem support"
+	select CONNECTOR
+	help
+	  POHMELFS stands for Parallel Optimized Host Message Exchange Layered File System.
+	  This is a network filesystem which supports coherent caching of data and metadata
+	  on clients.
+
+config POHMELFS_DEBUG
+	bool "POHMELFS debugging"
+	depends on POHMELFS
+	default n
+	help
+	  Turns on excessive POHMELFS debugging facilities.
+	  You usually do not want to slow things down noticebly and get really lots of kernel
+	  messages in syslog.
+
+config POHMELFS_CC_GROUP
+	bool "POHMELFS cache coherency protocol"
+	depends on POHMELFS
+	default y
+	help
+	  This allows to broadcast data and metadata cache coherency messages between clients.
+	  Usually you want this facility, although without locking you can get different from
+	  POSIX expectation behaviour. For more details check POHMELFS homepage and development
+	  section.
+
+config POHMELFS_CRYPTO
+	bool "POHMELFS crypto support"
+	depends on CONFIG_CRYPTO_BLKCIPHER && CONFIG_CRYPTO_HASH
+	help
+	  This option allows to encrypt and/or protect with strong cryptographic hash all dataflow
+	  between server and clients. Each config group can have own keys.
diff --git a/fs/pohmelfs/Makefile b/fs/pohmelfs/Makefile
new file mode 100644
index 0000000..1d1325b
--- /dev/null
+++ b/fs/pohmelfs/Makefile
@@ -0,0 +1,3 @@
+obj-$(CONFIG_POHMELFS)	+= pohmelfs.o
+
+pohmelfs-y := inode.o config.o dir.o net.o path_entry.o trans.o crypto.o
diff --git a/fs/pohmelfs/config.c b/fs/pohmelfs/config.c
new file mode 100644
index 0000000..f96dbc9
--- /dev/null
+++ b/fs/pohmelfs/config.c
@@ -0,0 +1,393 @@
+/*
+ * 2007+ Copyright (c) Evgeniy Polyakov <johnpol@2ka.mipt.ru>
+ * All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/kernel.h>
+#include <linux/connector.h>
+#include <linux/crypto.h>
+#include <linux/list.h>
+#include <linux/mutex.h>
+#include <linux/string.h>
+
+#include "netfs.h"
+
+/*
+ * Global configuration list.
+ * Each client can be asked to get one of them.
+ *
+ * Allows to provide remote server address (ipv4/v6/whatever), port
+ * and so on via kernel connector.
+ */
+
+static struct cb_id pohmelfs_cn_id = {.idx = POHMELFS_CN_IDX, .val = POHMELFS_CN_VAL};
+static LIST_HEAD(pohmelfs_config_list);
+static DEFINE_MUTEX(pohmelfs_config_lock);
+
+static inline int pohmelfs_config_eql(struct pohmelfs_ctl *sc, struct pohmelfs_ctl *ctl)
+{
+	if (sc->idx == ctl->idx && sc->type == ctl->type &&
+			sc->proto == ctl->proto &&
+			sc->addrlen == ctl->addrlen &&
+			!memcmp(&sc->addr, &ctl->addr, ctl->addrlen))
+		return 1;
+
+	return 0;
+}
+
+static struct pohmelfs_config_group *pohmelfs_find_config_group(unsigned int idx)
+{
+	struct pohmelfs_config_group *g, *group = NULL;
+
+	list_for_each_entry(g, &pohmelfs_config_list, group_entry) {
+		if (g->idx == idx) {
+			group = g;
+			break;
+		}
+	}
+
+	return group;
+}
+
+static struct pohmelfs_config_group *pohmelfs_find_create_config_group(unsigned int idx)
+{
+	struct pohmelfs_config_group *g;
+	
+	g = pohmelfs_find_config_group(idx);
+	if (g)
+		return g;
+
+	g = kzalloc(sizeof(struct pohmelfs_config_group), GFP_KERNEL);
+	if (!g)
+		return NULL;
+
+	INIT_LIST_HEAD(&g->config_list);
+	g->idx = idx;
+
+	list_add_tail(&g->group_entry, &pohmelfs_config_list);
+
+	return g;
+}
+
+int pohmelfs_copy_config(struct pohmelfs_sb *psb)
+{
+	struct pohmelfs_config_group *g;
+	struct pohmelfs_config *c, *dst;
+	int err = -ENODEV;
+
+	mutex_lock(&pohmelfs_config_lock);
+
+	g = pohmelfs_find_config_group(psb->idx);
+	if (!g)
+		goto out_unlock;
+
+	/*
+	 * Run over all entries in given config group and try to crate and
+	 * initialize those, which do not exist in superblock list.
+	 * Skip all existing entries.
+	 */
+
+	list_for_each_entry(c, &g->config_list, config_entry) {
+		err = 0;
+		list_for_each_entry(dst, &psb->state_list, config_entry) {
+			if (pohmelfs_config_eql(&dst->state.ctl, &c->state.ctl)) {
+				err = -EEXIST;
+				break;
+			}
+		}
+
+		if (err)
+			continue;
+
+		dst = kzalloc(sizeof(struct pohmelfs_config), GFP_KERNEL);
+		if (!dst) {
+			err = -ENOMEM;
+			break;
+		}
+
+		memcpy(&dst->state.ctl, &c->state.ctl, sizeof(struct pohmelfs_ctl));
+
+		list_add_tail(&dst->config_entry, &psb->state_list);
+
+		err = pohmelfs_state_init_one(psb, dst);
+		if (err) {
+			list_del(&dst->config_entry);
+			kfree(dst);
+		}
+
+		err = 0;
+	}
+
+out_unlock:
+	mutex_unlock(&pohmelfs_config_lock);
+
+	return err;
+}
+
+int pohmelfs_copy_crypto(struct pohmelfs_sb *psb)
+{
+	struct pohmelfs_config_group *g;
+	int err = -ENOENT;
+
+	mutex_lock(&pohmelfs_config_lock);
+	g = pohmelfs_find_config_group(psb->idx);
+	if (g) {
+		psb->hash_string = g->hash_string;
+		psb->hash_strlen = g->hash_strlen;
+		g->hash_string = NULL;
+		g->hash_strlen = 0;
+
+		psb->cipher_string = g->cipher_string;
+		psb->cipher_strlen = g->cipher_strlen;
+		g->cipher_string = NULL;
+		g->cipher_strlen = 0;
+		
+		psb->hash_key = g->hash_key;
+		psb->hash_keysize = g->hash_keysize;
+		g->hash_key = NULL;
+		g->hash_keysize = 0;
+		
+		psb->cipher_key = g->cipher_key;
+		psb->cipher_keysize = g->cipher_keysize;
+		g->cipher_key = NULL;
+		g->cipher_keysize = 0;
+
+		err = 0;
+	}
+	mutex_unlock(&pohmelfs_config_lock);
+
+	return err;
+}
+
+static int pohmelfs_cn_ctl(struct cn_msg *msg)
+{
+	struct pohmelfs_config_group *g;
+	struct pohmelfs_ctl *ctl = (struct pohmelfs_ctl *)msg->data;
+	struct pohmelfs_config *c, *tmp;
+	int err = 0;
+
+	if (msg->len != sizeof(struct pohmelfs_ctl))
+		return -EBADMSG;
+
+	mutex_lock(&pohmelfs_config_lock);
+
+	g = pohmelfs_find_create_config_group(ctl->idx);
+	if (!g) {
+		err = -ENOMEM;
+		goto out_unlock;
+	}
+
+	list_for_each_entry_safe(c, tmp, &g->config_list, config_entry) {
+		struct pohmelfs_ctl *sc = &c->state.ctl;
+
+		if (pohmelfs_config_eql(sc, ctl)) {
+			list_del(&c->config_entry);
+			kfree(c);
+			err = -EEXIST;
+			goto out_unlock;
+		}
+	}
+
+	c = kzalloc(sizeof(struct pohmelfs_config), GFP_KERNEL);
+	if (!c) {
+		err = -ENOMEM;
+		goto out_unlock;
+	}
+
+	err = 0;
+	memcpy(&c->state.ctl, ctl, sizeof(struct pohmelfs_ctl));
+	list_add_tail(&c->config_entry, &g->config_list);
+
+out_unlock:
+	mutex_unlock(&pohmelfs_config_lock);
+
+	return err;
+}
+
+static int pohmelfs_crypto_hash_init(struct pohmelfs_config_group *g, struct pohmelfs_crypto *c)
+{
+	char *algo = (char *)c->data;
+	u8 *key = (u8 *)(algo + c->strlen);
+
+	if (g->hash_string)
+		return -EEXIST;
+
+	g->hash_string = kstrdup(algo, GFP_KERNEL);
+	if (!g->hash_string)
+		return -ENOMEM;
+	g->hash_strlen = c->strlen;
+	g->hash_keysize = c->keysize;
+
+	g->hash_key = kmalloc(c->keysize, GFP_KERNEL);
+	if (!g->hash_key) {
+		kfree(g->hash_string);
+		return -ENOMEM;
+	}
+
+	memcpy(g->hash_key, key, c->keysize);
+
+	return 0;
+}
+
+static int pohmelfs_crypto_cipher_init(struct pohmelfs_config_group *g, struct pohmelfs_crypto *c)
+{
+	char *algo = (char *)c->data;
+	u8 *key = (u8 *)(algo + c->strlen);
+
+	if (g->cipher_string)
+		return -EEXIST;
+
+	g->cipher_string = kstrdup(algo, GFP_KERNEL);
+	if (!g->cipher_string)
+		return -ENOMEM;
+	g->cipher_strlen = c->strlen;
+	g->cipher_keysize = c->keysize;
+
+	g->cipher_key = kmalloc(c->keysize, GFP_KERNEL);
+	if (!g->cipher_key) {
+		kfree(g->cipher_string);
+		return -ENOMEM;
+	}
+
+	memcpy(g->cipher_key, key, c->keysize);
+
+	return 0;
+}
+
+
+static int pohmelfs_cn_crypto(struct cn_msg *msg)
+{
+	struct pohmelfs_crypto *crypto = (struct pohmelfs_crypto *)msg->data;
+	struct pohmelfs_config_group *g;
+	int err = 0;
+
+	dprintk("%s: idx: %u, strlen: %u, type: %u, keysize: %u, algo: %s.\n",
+			__func__, crypto->idx, crypto->strlen, crypto->type,
+			crypto->keysize, (char *)crypto->data);
+
+	mutex_lock(&pohmelfs_config_lock);
+	g = pohmelfs_find_create_config_group(crypto->idx);
+	if (!g) {
+		err = -ENOMEM;
+		goto out_unlock;
+	}
+
+	switch (crypto->type) {
+		case POHMELFS_CRYPTO_HASH:
+			err = pohmelfs_crypto_hash_init(g, crypto);
+			break;
+		case POHMELFS_CRYPTO_CIPHER:
+			err = pohmelfs_crypto_cipher_init(g, crypto);
+			break;
+		default:
+			err = -ENOTSUPP;
+			break;
+	}
+
+out_unlock:
+	mutex_unlock(&pohmelfs_config_lock);
+
+	return err;
+}
+
+static void pohmelfs_cn_callback(void *data)
+{
+	struct cn_msg *msg = data;
+	struct pohmelfs_cn_ack *ack;
+	int err;
+
+	switch (msg->flags) {
+		case POHMELFS_FLAGS_CTL:
+			err = pohmelfs_cn_ctl(msg);
+			break;
+		case POHMELFS_FLAGS_CRYPTO:
+			err = pohmelfs_cn_crypto(msg);
+			break;
+		default:
+			err = -ENOSYS;
+			break;
+	}
+
+	ack = kmalloc(sizeof(struct pohmelfs_cn_ack), GFP_KERNEL);
+	if (!ack)
+		return;
+
+	memcpy(&ack->msg, msg, sizeof(struct cn_msg));
+
+	ack->msg.ack = msg->ack + 1;
+	ack->msg.len = sizeof(struct pohmelfs_cn_ack) - sizeof(struct cn_msg);
+
+	ack->error = err;
+
+	cn_netlink_send(&ack->msg, 0, GFP_KERNEL);
+	kfree(ack);
+}
+
+int pohmelfs_config_check(struct pohmelfs_config *config, int idx)
+{
+	struct pohmelfs_ctl *ctl = &config->state.ctl;
+	struct pohmelfs_config *tmp;
+	int err = -ENOENT;
+	struct pohmelfs_ctl *sc;
+	struct pohmelfs_config_group *g;
+
+	mutex_lock(&pohmelfs_config_lock);
+
+	g = pohmelfs_find_config_group(ctl->idx);
+	if (g) {
+		list_for_each_entry(tmp, &g->config_list, config_entry) {
+			sc = &tmp->state.ctl;
+
+			if (pohmelfs_config_eql(sc, ctl)) {
+				err = 0;
+				break;
+			}
+		}
+	}
+
+	mutex_unlock(&pohmelfs_config_lock);
+
+	return err;
+}
+
+int __init pohmelfs_config_init(void)
+{
+	return cn_add_callback(&pohmelfs_cn_id, "pohmelfs", pohmelfs_cn_callback);
+}
+
+void __exit pohmelfs_config_exit(void)
+{
+	struct pohmelfs_config *c, *tmp;
+	struct pohmelfs_config_group *g, *gtmp;
+
+	cn_del_callback(&pohmelfs_cn_id);
+
+	mutex_lock(&pohmelfs_config_lock);
+	list_for_each_entry_safe(g, gtmp, &pohmelfs_config_list, group_entry) {
+		list_for_each_entry_safe(c, tmp, &g->config_list, config_entry) {
+			list_del(&c->config_entry);
+			kfree(c);
+		}
+
+		list_del(&g->group_entry);
+
+		if (g->hash_string)
+			kfree(g->hash_string);
+
+		if (g->cipher_string)
+			kfree(g->cipher_string);
+
+		kfree(g);
+	}
+	mutex_unlock(&pohmelfs_config_lock);
+}
diff --git a/fs/pohmelfs/crypto.c b/fs/pohmelfs/crypto.c
new file mode 100644
index 0000000..bc5f010
--- /dev/null
+++ b/fs/pohmelfs/crypto.c
@@ -0,0 +1,862 @@
+/*
+ * 2007+ Copyright (c) Evgeniy Polyakov <johnpol@2ka.mipt.ru>
+ * All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/crypto.h>
+#include <linux/highmem.h>
+#include <linux/kthread.h>
+#include <linux/pagemap.h>
+#include <linux/slab.h>
+
+#include "netfs.h"
+
+static struct crypto_hash *pohmelfs_init_hash(struct pohmelfs_sb *psb)
+{
+	int err;
+	struct crypto_hash *hash;
+
+	hash = crypto_alloc_hash(psb->hash_string, 0, CRYPTO_ALG_ASYNC);
+	if (IS_ERR(hash)) {
+		err = PTR_ERR(hash);
+		dprintk("%s: idx: %u: failed to allocate hash '%s', err: %d.\n",
+				__func__, psb->idx, psb->hash_string, err);
+		goto err_out_exit;
+	}
+
+	psb->crypto_attached_size = crypto_hash_digestsize(hash);
+
+	if (!psb->hash_keysize)
+		return hash;
+
+	err = crypto_hash_setkey(hash, psb->hash_key, psb->hash_keysize);
+	if (err) {
+		dprintk("%s: idx: %u: failed to set key for hash '%s', err: %d.\n",
+				__func__, psb->idx, psb->hash_string, err);
+		goto err_out_free;
+	}
+
+	return hash;
+
+err_out_free:
+	crypto_free_hash(hash);
+err_out_exit:
+	return ERR_PTR(err);
+}
+
+static struct crypto_ablkcipher *pohmelfs_init_cipher(struct pohmelfs_sb *psb)
+{
+	int err = -EINVAL;
+	struct crypto_ablkcipher *cipher;
+
+	if (!psb->cipher_keysize)
+		goto err_out_exit;
+
+	cipher = crypto_alloc_ablkcipher(psb->cipher_string, 0, 0);
+	if (IS_ERR(cipher)) {
+		err = PTR_ERR(cipher);
+		dprintk("%s: idx: %u: failed to allocate cipher '%s', err: %d.\n",
+				__func__, psb->idx, psb->cipher_string, err);
+		goto err_out_exit;
+	}
+
+	crypto_ablkcipher_clear_flags(cipher, ~0);
+
+	err = crypto_ablkcipher_setkey(cipher, psb->cipher_key, psb->cipher_keysize);
+	if (err) {
+		dprintk("%s: idx: %u: failed to set key for cipher '%s', err: %d.\n",
+				__func__, psb->idx, psb->cipher_string, err);
+		goto err_out_free;
+	}
+
+	return cipher;
+
+err_out_free:
+	crypto_free_ablkcipher(cipher);
+err_out_exit:
+	return ERR_PTR(err);
+}
+
+int pohmelfs_crypto_engine_init(struct pohmelfs_crypto_engine *e, struct pohmelfs_sb *psb)
+{
+	int err;
+
+	e->page_num = 0;
+
+	e->size = PAGE_SIZE;
+	e->data = kmalloc(e->size, GFP_KERNEL);
+	if (!e->data) {
+		err = -ENOMEM;
+		goto err_out_exit;
+	}
+
+	if (psb->hash_string) {
+		e->hash = pohmelfs_init_hash(psb);
+		if (IS_ERR(e->hash)) {
+			err = PTR_ERR(e->hash);
+			e->hash = NULL;
+			goto err_out_free;
+		}
+	}
+	
+	if (psb->cipher_string) {
+		e->cipher = pohmelfs_init_cipher(psb);
+		if (IS_ERR(e->cipher)) {
+			err = PTR_ERR(e->cipher);
+			e->cipher = NULL;
+			goto err_out_free_hash;
+		}
+	}
+
+	return 0;
+
+err_out_free_hash:
+	crypto_free_hash(e->hash);
+err_out_free:
+	kfree(e->data);
+err_out_exit:
+	return err;
+}
+
+void pohmelfs_crypto_engine_exit(struct pohmelfs_crypto_engine *e)
+{
+	if (e->hash)
+		crypto_free_hash(e->hash);
+	if (e->cipher)
+		crypto_free_ablkcipher(e->cipher);
+	kfree(e->data);
+}
+
+static void pohmelfs_crypto_complete(struct crypto_async_request *req, int err)
+{
+	struct pohmelfs_crypto_completion *c = req->data;
+
+	if (err == -EINPROGRESS)
+		return;
+
+	dprintk("%s: req: %p, err: %d.\n", __func__, req, err);
+	c->error = err;
+	complete(&c->complete);
+}
+
+static int pohmelfs_crypto_process(struct ablkcipher_request *req,
+		struct scatterlist *sg_dst, struct scatterlist *sg_src,
+		void *iv, int enc, unsigned long timeout)
+{
+	struct pohmelfs_crypto_completion complete;
+	int err;
+
+	init_completion(&complete.complete);
+	complete.error = -EINPROGRESS;
+
+	ablkcipher_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG,
+					pohmelfs_crypto_complete, &complete);
+
+	ablkcipher_request_set_crypt(req, sg_src, sg_dst, sg_src->length, iv);
+
+	if (enc)
+		err = crypto_ablkcipher_encrypt(req);
+	else
+		err = crypto_ablkcipher_decrypt(req);
+
+	switch (err) {
+		case -EINPROGRESS:
+		case -EBUSY:
+			err = wait_for_completion_interruptible_timeout(&complete.complete,
+					timeout);
+			if (!err)
+				err = -ETIMEDOUT;
+			else
+				err = complete.error;
+			break;
+		default:
+			break;
+	}
+
+	return err;
+}
+
+int pohmelfs_crypto_process_input_data(struct pohmelfs_crypto_engine *e, u64 cmd_iv,
+		void *data, struct page *page, unsigned int size)
+{
+	int err;
+	struct scatterlist sg;
+
+	dprintk("%s: eng: %p, iv: %llx, data: %p, page: %p/%lu, size: %u: ",
+		__func__, e, cmd_iv, data, page, (page)?page->index:0, size);
+
+	if (data) {
+		sg_init_one(&sg, data, size);
+	} else {
+		sg_init_table(&sg, 1);
+		sg_set_page(&sg, page, size, 0);
+	}
+
+	if (e->cipher) {
+		struct ablkcipher_request *req = e->data + crypto_hash_digestsize(e->hash);
+		u8 iv[32];
+
+		memset(iv, 0, sizeof(iv));
+		memcpy(iv, &cmd_iv, sizeof(cmd_iv));
+
+		ablkcipher_request_set_tfm(req, e->cipher);
+
+		err = pohmelfs_crypto_process(req, &sg, &sg, iv, 0, e->timeout);
+		if (err)
+			goto err_out_exit;
+	}
+
+	if (e->hash) {
+		struct hash_desc desc;
+		void *dst = e->data + e->size/2;
+
+		desc.tfm = e->hash;
+		desc.flags = 0;
+		
+		err = crypto_hash_init(&desc);
+		if (err)
+			goto err_out_exit;
+
+		err = crypto_hash_update(&desc, &sg, size);
+		if (err)
+			goto err_out_exit;
+		
+		err = crypto_hash_final(&desc, dst);
+		if (err)
+			goto err_out_exit;
+
+		err = !!memcmp(dst, e->data, crypto_hash_digestsize(e->hash));
+
+		if (err) {
+#ifdef CONFIG_POHMELFS_DEBUG
+			unsigned int i;
+			unsigned char *recv = e->data, *calc = dst;
+
+			dprintk("%s: eng: %p, hash: %p, cipher: %p: iv : %llx, hash mismatch (recv/calc): ",
+					__func__, e, e->hash, e->cipher, cmd_iv);
+			for (i=0; i<crypto_hash_digestsize(e->hash); ++i) {
+#if 0
+				dprintk("%02x ", recv[i]);
+				if (recv[i] != calc[i]) {
+					dprintk("| calc byte: %02x.\n", calc[i]);
+					break;
+				}
+#else
+				dprintk("%02x/%02x ", recv[i], calc[i]);
+#endif
+			}
+			dprintk("\n");
+#endif
+			goto err_out_exit;
+		} else {
+			dprintk("%s: eng: %p, hash: %p, cipher: %p: hashes matched.\n",
+					__func__, e, e->hash, e->cipher);
+		}
+	}
+	
+	dprintk("%s: eng: %p, hash: %p, cipher: %p: completed.\n",
+			__func__, e, e->hash, e->cipher);
+
+	return 0;
+
+err_out_exit:
+	dprintk("%s: eng: %p, hash: %p, cipher: %p: err: %d.\n",
+			__func__, e, e->hash, e->cipher, err);
+	return err;
+}
+
+static int pohmelfs_trans_iter(struct netfs_trans *t, struct pohmelfs_crypto_engine *e,
+		int (* iterator) (struct pohmelfs_crypto_engine *e,
+				  struct scatterlist *dst,
+				  struct scatterlist *src))
+{
+	void *data = t->iovec.iov_base + sizeof(struct netfs_cmd) + t->psb->crypto_attached_size;
+	unsigned int size = t->iovec.iov_len - sizeof(struct netfs_cmd) - t->psb->crypto_attached_size;
+	struct netfs_cmd *cmd = data;
+	unsigned int sz, pages = t->attached_pages, i, csize, cmd_cmd, dpage_idx;
+	struct scatterlist sg_src, sg_dst;
+	int err;
+
+	while (size) {
+		cmd = data;
+		cmd_cmd = __be16_to_cpu(cmd->cmd);
+		csize = __be32_to_cpu(cmd->size);
+		cmd->iv = __cpu_to_be64(e->iv);
+
+		if (cmd_cmd == NETFS_READ_PAGES || cmd_cmd == NETFS_READ_PAGE)
+			csize = __be16_to_cpu(cmd->ext);
+
+		sz = csize + __be16_to_cpu(cmd->cpad) + sizeof(struct netfs_cmd);
+		
+		dprintk("%s: size: %u, sz: %u, cmd_size: %u, cmd_cpad: %u.\n",
+				__func__, size, sz, __be32_to_cpu(cmd->size), __be16_to_cpu(cmd->cpad));
+
+		data += sz;
+		size -= sz;
+		
+		sg_init_one(&sg_src, cmd->data, sz - sizeof(struct netfs_cmd));
+		sg_init_one(&sg_dst, cmd->data, sz - sizeof(struct netfs_cmd));
+
+		err = iterator(e, &sg_dst, &sg_src);
+		if (err)
+			return err;
+	}
+	
+	if (!pages)
+		return 0;
+
+	dpage_idx = 0;
+	for (i=0; i<t->page_num; ++i) {
+		struct page *page = t->pages[i];
+		struct page *dpage = e->pages[dpage_idx];
+
+		if (!page)
+			continue;
+
+		sg_init_table(&sg_src, 1);
+		sg_init_table(&sg_dst, 1);
+		sg_set_page(&sg_src, page, page_private(page), 0);
+		sg_set_page(&sg_dst, dpage, page_private(page), 0);
+
+		err = iterator(e, &sg_dst, &sg_src);
+		if (err)
+			return err;
+
+		pages--;
+		if (!pages)
+			break;
+		dpage_idx++;
+	}
+
+	return 0;
+}
+
+static int pohmelfs_encrypt_iterator(struct pohmelfs_crypto_engine *e,
+		struct scatterlist *sg_dst, struct scatterlist *sg_src)
+{
+	struct ablkcipher_request *req = e->data;
+	u8 iv[32];
+
+	memset(iv, 0, sizeof(iv));
+
+	memcpy(iv, &e->iv, sizeof(e->iv));
+
+	return pohmelfs_crypto_process(req, sg_dst, sg_src, iv, 1, e->timeout);
+}
+
+static int pohmelfs_encrypt(struct pohmelfs_crypto_thread *tc)
+{
+	struct netfs_trans *t = tc->trans;
+	struct pohmelfs_crypto_engine *e = &tc->eng;
+	struct ablkcipher_request *req = e->data;
+
+	memset(req, 0, sizeof(struct ablkcipher_request));
+	ablkcipher_request_set_tfm(req, e->cipher);
+
+	e->iv = pohmelfs_gen_iv(t);
+
+	return pohmelfs_trans_iter(t, e, pohmelfs_encrypt_iterator);
+}
+
+static int pohmelfs_hash_iterator(struct pohmelfs_crypto_engine *e,
+		struct scatterlist *sg_dst, struct scatterlist *sg_src)
+{
+	return crypto_hash_update(e->data, sg_src, sg_src->length);
+}
+
+static int pohmelfs_hash(struct pohmelfs_crypto_thread *tc)
+{
+	struct pohmelfs_crypto_engine *e = &tc->eng;
+	struct hash_desc *desc = e->data;
+	unsigned char *dst = tc->trans->iovec.iov_base + sizeof(struct netfs_cmd);
+	int err;
+
+	desc->tfm = e->hash;
+	desc->flags = 0;
+
+	err = crypto_hash_init(desc);
+	if (err)
+		return err;
+
+	err = pohmelfs_trans_iter(tc->trans, e, pohmelfs_hash_iterator);
+	if (err)
+		return err;
+
+	err = crypto_hash_final(desc, dst);
+	if (err)
+		return err;
+
+	{
+		unsigned int i;
+		dprintk("%s: ", __func__);
+		for (i=0; i<tc->psb->crypto_attached_size; ++i)
+			dprintk("%02x ", dst[i]);
+		dprintk("\n");
+	}
+
+	return 0;
+}
+
+static void pohmelfs_crypto_pages_free(struct pohmelfs_crypto_engine *e)
+{
+	unsigned int i;
+
+	for (i=0; i<e->page_num; ++i)
+		__free_page(e->pages[i]);
+	kfree(e->pages);
+}
+
+static int pohmelfs_crypto_pages_alloc(struct pohmelfs_crypto_engine *e, struct pohmelfs_sb *psb)
+{
+	unsigned int i;
+
+	e->pages = kmalloc(psb->trans_max_pages * sizeof(struct page *), GFP_KERNEL);
+	if (!e->pages)
+		return -ENOMEM;
+
+	for (i=0; i<psb->trans_max_pages; ++i) {
+		e->pages[i] = alloc_page(GFP_KERNEL);
+		if (!e->pages[i])
+			break;
+	}
+
+	e->page_num = i;
+	if (!e->page_num)
+		goto err_out_free;
+
+	return 0;
+
+err_out_free:
+	kfree(e->pages);
+	return -ENOMEM;
+}
+
+static void pohmelfs_sys_crypto_exit_one(struct pohmelfs_crypto_thread *t)
+{
+	if (t->thread)
+		kthread_stop(t->thread);
+	pohmelfs_crypto_engine_exit(&t->eng);
+	pohmelfs_crypto_pages_free(&t->eng);
+	kfree(t);
+}
+
+static int pohmelfs_crypto_finish(struct netfs_trans *t, struct pohmelfs_sb *psb, int err)
+{
+	if (likely(!err)) {
+		struct netfs_cmd *cmd = t->iovec.iov_base;
+		netfs_convert_cmd(cmd);
+
+		err = netfs_trans_finish_send(t, psb);
+	}
+	t->result = err;
+	netfs_trans_put(t);
+
+	return err;
+}
+
+void pohmelfs_crypto_thread_make_ready(struct pohmelfs_crypto_thread *th)
+{
+	struct pohmelfs_sb *psb = th->psb;
+
+	th->page = NULL;
+	th->trans = NULL;
+
+	mutex_lock(&psb->crypto_thread_lock);
+	list_move_tail(&th->thread_entry, &psb->crypto_ready_list);
+	mutex_unlock(&psb->crypto_thread_lock);
+	wake_up(&psb->wait);
+}
+
+static int pohmelfs_crypto_thread_trans(struct pohmelfs_crypto_thread *t)
+{
+	struct netfs_trans *trans;
+	int err = 0;
+
+	trans = t->trans;
+	trans->eng = NULL;
+
+	if (t->eng.hash) {
+		err = pohmelfs_hash(t);
+		if (err)
+			goto out_complete;
+	}
+
+	if (t->eng.cipher) {
+		err = pohmelfs_encrypt(t);
+		if (err)
+			goto out_complete;
+		trans->eng = &t->eng;
+	}
+
+out_complete:
+	t->page = NULL;
+	t->trans = NULL;
+
+	if (!trans->eng)
+		pohmelfs_crypto_thread_make_ready(t);
+
+	pohmelfs_crypto_finish(trans, t->psb, err);
+	return err;
+}
+
+static int pohmelfs_crypto_thread_page(struct pohmelfs_crypto_thread *t)
+{
+	struct pohmelfs_crypto_engine *e = &t->eng;
+	struct page *page = t->page;
+	int err;
+
+	WARN_ON(!PageChecked(page));
+
+	err = pohmelfs_crypto_process_input_data(e, e->iv, NULL, page, t->size);
+	if (!err)
+		SetPageUptodate(page);
+	else
+		SetPageError(page);
+	unlock_page(page);
+	page_cache_release(page);
+
+	pohmelfs_crypto_thread_make_ready(t);
+
+	return err;
+}
+
+static int pohmelfs_crypto_thread_func(void *data)
+{
+	struct pohmelfs_crypto_thread *t = data;
+
+	while (!kthread_should_stop()) {
+		wait_event_interruptible(t->wait, kthread_should_stop() ||
+				t->trans || t->page);
+
+		if (kthread_should_stop())
+			break;
+
+		if (!t->trans && !t->page)
+			continue;
+
+		dprintk("%s: thread: %p, trans: %p, page: %p.\n",
+				__func__, t, t->trans, t->page);
+
+		if (t->trans)
+			pohmelfs_crypto_thread_trans(t);
+		else if (t->page)
+			pohmelfs_crypto_thread_page(t);
+	}
+
+	return 0;
+}
+
+static void pohmelfs_sys_crypto_exit(struct pohmelfs_sb *psb)
+{
+	struct pohmelfs_crypto_thread *t, *tmp;
+
+	mutex_lock(&psb->crypto_thread_lock);
+	list_for_each_entry_safe(t, tmp, &psb->crypto_active_list, thread_entry) {
+		list_del(&t->thread_entry);
+
+		pohmelfs_sys_crypto_exit_one(t);
+	}
+
+	list_for_each_entry_safe(t, tmp, &psb->crypto_ready_list, thread_entry) {
+		list_del(&t->thread_entry);
+
+		pohmelfs_sys_crypto_exit_one(t);
+	}
+	mutex_unlock(&psb->crypto_thread_lock);
+}
+
+static int pohmelfs_sys_crypto_init(struct pohmelfs_sb *psb)
+{
+	unsigned int i;
+	struct pohmelfs_crypto_thread *t;
+	struct pohmelfs_config *c;
+	struct netfs_state *st;
+	int err;
+
+	list_for_each_entry(c, &psb->state_list, config_entry) {
+		st = &c->state;
+
+		err = pohmelfs_crypto_engine_init(&st->eng, psb);
+		if (err)
+			goto err_out_exit;
+
+		dprintk("%s: st: %p, eng: %p, hash: %p, cipher: %p.\n",
+				__func__, st, &st->eng, &st->eng.hash, &st->eng.cipher);
+	}
+
+	for (i=0; i<psb->crypto_thread_num; ++i) {
+		err = -ENOMEM;
+		t = kzalloc(sizeof(struct pohmelfs_crypto_thread), GFP_KERNEL);
+		if (!t)
+			goto err_out_free_state_engines;
+
+		init_waitqueue_head(&t->wait);
+
+		t->psb = psb;
+		t->trans = NULL;
+		t->eng.thread = t;
+
+		err = pohmelfs_crypto_engine_init(&t->eng, psb);
+		if (err)
+			goto err_out_free_state_engines;
+
+		err = pohmelfs_crypto_pages_alloc(&t->eng, psb);
+		if (err)
+			goto err_out_free;
+
+		t->thread = kthread_run(pohmelfs_crypto_thread_func, t,
+				"pohmelfs-crypto-%d-%d", psb->idx, i);
+		if (IS_ERR(t->thread)) {
+			err = PTR_ERR(t->thread);
+			t->thread = NULL;
+			goto err_out_free;
+		}
+
+		if (t->eng.cipher)
+			psb->crypto_align_size = crypto_ablkcipher_blocksize(t->eng.cipher);
+
+		mutex_lock(&psb->crypto_thread_lock);
+		list_add_tail(&t->thread_entry, &psb->crypto_ready_list);
+		mutex_unlock(&psb->crypto_thread_lock);
+	}
+	
+	psb->crypto_thread_num = i;
+	return 0;
+
+err_out_free:
+	pohmelfs_sys_crypto_exit_one(t);
+err_out_free_state_engines:
+	list_for_each_entry(c, &psb->state_list, config_entry) {
+		st = &c->state;
+		pohmelfs_crypto_engine_exit(&st->eng);
+	}
+err_out_exit:
+	pohmelfs_sys_crypto_exit(psb);
+	return err;
+}
+
+void pohmelfs_crypto_exit(struct pohmelfs_sb *psb)
+{
+	pohmelfs_sys_crypto_exit(psb);
+
+	kfree(psb->hash_string);
+	kfree(psb->cipher_string);
+}
+
+static int pohmelfs_crypt_init_complete(struct page **pages, unsigned int page_num,
+		void *private, int err)
+{
+	struct pohmelfs_sb *psb = private;
+
+	psb->flags = -err;
+	dprintk("%s: err: %d, flags: %lx.\n", __func__, err, psb->flags);
+
+	wake_up(&psb->wait);
+
+	return err;
+}
+
+static int pohmelfs_crypto_init_handshake(struct pohmelfs_sb *psb)
+{
+	struct netfs_trans *t;
+	struct netfs_capabilities *cap;
+	struct netfs_cmd *cmd;
+	char *str;
+	int err = -ENOMEM, size;
+
+	size = sizeof(struct netfs_capabilities) +
+		psb->cipher_strlen + psb->hash_strlen + 2; /* 0 bytes */
+
+	t = netfs_trans_alloc(psb, size, 0, 0);
+	if (!t)
+		goto err_out_exit;
+
+	t->complete = pohmelfs_crypt_init_complete;
+	t->private = psb;
+
+	cmd = netfs_trans_current(t);
+	cap = (struct netfs_capabilities *)(cmd + 1);
+	str = (char *)(cap + 1);
+
+	cmd->cmd = NETFS_CAPABILITIES;
+	cmd->id = psb->idx;
+	cmd->size = size;
+	cmd->start = 0;
+	cmd->ext = 0;
+	cmd->csize = 0;
+
+	netfs_convert_cmd(cmd);
+	netfs_trans_update(cmd, t, size);
+
+	cap->hash_strlen = psb->hash_strlen;
+	if (cap->hash_strlen) {
+		sprintf(str, "%s", psb->hash_string);
+		str += cap->hash_strlen;
+	}
+
+	cap->cipher_strlen = psb->cipher_strlen;
+	cap->cipher_keysize = psb->cipher_keysize;
+	if (cap->cipher_strlen)
+		sprintf(str, "%s", psb->cipher_string);
+
+	netfs_convert_capabilities(cap);
+
+	psb->flags = ~0;
+	err = netfs_trans_finish(t, psb);
+	if (err)
+		goto err_out_exit;
+
+	err = wait_event_interruptible_timeout(psb->wait, (psb->flags != ~0),
+			psb->wait_on_page_timeout);
+	if (!err)
+		err = -ETIMEDOUT;
+	else
+		err = -psb->flags;
+
+	if (!err)
+		psb->perform_crypto = 1;
+	psb->flags = 0;
+
+	/*
+	 * At this point NETFS_CAPABILITIES response command
+	 * should setup superblock in a way, which is acceptible
+	 * for both client and server, so if server refuses connection,
+	 * it will send error in transaction response.
+	 */
+
+	if (err)
+		goto err_out_exit;
+
+	return 0;
+
+err_out_exit:
+	return err;
+}
+
+int pohmelfs_crypto_init(struct pohmelfs_sb *psb)
+{
+	int err;
+
+	if (!psb->cipher_string && !psb->hash_string)
+		return 0;
+
+	err = pohmelfs_crypto_init_handshake(psb);
+	if (err)
+		return err;
+
+	err = pohmelfs_sys_crypto_init(psb);
+	if (err)
+		return err;
+
+	return 0;
+}
+
+static int pohmelfs_crypto_thread_get(struct pohmelfs_sb *psb,
+		int (* action)(struct pohmelfs_crypto_thread *t, void *data), void *data)
+{
+	struct pohmelfs_crypto_thread *t = NULL;
+	int err;
+
+	while (!t) {
+		err = wait_event_interruptible_timeout(psb->wait,
+				!list_empty(&psb->crypto_ready_list),
+				psb->wait_on_page_timeout);
+
+		t = NULL;
+		err = 0;
+		mutex_lock(&psb->crypto_thread_lock);
+		if (!list_empty(&psb->crypto_ready_list)) {
+			t = list_entry(psb->crypto_ready_list.prev,
+					struct pohmelfs_crypto_thread,
+					thread_entry);
+
+			list_move_tail(&t->thread_entry,
+					&psb->crypto_active_list);
+
+			action(t, data);
+			wake_up(&t->wait);
+
+		}
+		mutex_unlock(&psb->crypto_thread_lock);
+	}
+
+	return err;
+}
+
+static int pohmelfs_trans_crypt_action(struct pohmelfs_crypto_thread *t, void *data)
+{
+	struct netfs_trans *trans = data;
+
+	netfs_trans_get(trans);
+	t->trans = trans;
+
+	dprintk("%s: t: %p, gen: %u, thread: %p.\n", __func__, trans, trans->gen, t);
+	return 0;
+}
+
+int pohmelfs_trans_crypt(struct netfs_trans *trans, struct pohmelfs_sb *psb)
+{
+	if ((!psb->hash_string && !psb->cipher_string) || !psb->perform_crypto) {
+		netfs_trans_get(trans);
+		return pohmelfs_crypto_finish(trans, psb, 0);
+	}
+
+	return pohmelfs_crypto_thread_get(psb, pohmelfs_trans_crypt_action, trans);
+}
+
+struct pohmelfs_crypto_input_action_data
+{
+	struct page			*page;
+	struct pohmelfs_crypto_engine	*e;
+	u64				iv;
+	unsigned int			size;
+};
+
+static int pohmelfs_crypt_input_page_action(struct pohmelfs_crypto_thread *t, void *data)
+{
+	struct pohmelfs_crypto_input_action_data *act = data;
+
+	memcpy(t->eng.data, act->e->data, t->psb->crypto_attached_size);
+
+	t->size = act->size;
+	t->eng.iv = act->iv;
+
+	t->page = act->page;
+	return 0;
+}
+
+int pohmelfs_crypto_process_input_page(struct pohmelfs_crypto_engine *e,
+		struct page *page, unsigned int size, u64 iv)
+{
+	struct inode *inode = page->mapping->host;
+	struct pohmelfs_crypto_input_action_data act;
+	int err = -ENOENT;
+
+	act.page = page;
+	act.e = e;
+	act.size = size;
+	act.iv = iv;
+
+	err = pohmelfs_crypto_thread_get(POHMELFS_SB(inode->i_sb),
+			pohmelfs_crypt_input_page_action, &act);
+	if (err)
+		goto err_out_exit;
+
+	return 0;
+
+err_out_exit:
+	SetPageUptodate(page);
+	page_cache_release(page);
+
+	return err;
+}
diff --git a/fs/pohmelfs/dir.c b/fs/pohmelfs/dir.c
new file mode 100644
index 0000000..9fec116
--- /dev/null
+++ b/fs/pohmelfs/dir.c
@@ -0,0 +1,1180 @@
+/*
+ * 2007+ Copyright (c) Evgeniy Polyakov <johnpol@2ka.mipt.ru>
+ * All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/kernel.h>
+#include <linux/fs.h>
+#include <linux/jhash.h>
+#include <linux/pagemap.h>
+
+#include "netfs.h"
+
+/*
+ * Each pohmelfs directory inode contains a tree of childrens indexed
+ * by offset (in the dir reading stream) and name hash and len. Entries
+ * of that hashes are called pohmelfs_name.
+ *
+ * This routings deal with it.
+ */
+static int pohmelfs_cmp_offset(struct pohmelfs_name *n, u64 offset)
+{
+	if (n->offset > offset)
+		return -1;
+	if (n->offset < offset)
+		return 1;
+	return 0;
+}
+
+static struct pohmelfs_name *pohmelfs_search_offset(struct pohmelfs_inode *pi, u64 offset)
+{
+	struct rb_node *n = pi->offset_root.rb_node;
+	struct pohmelfs_name *tmp;
+	int cmp;
+
+	while (n) {
+		tmp = rb_entry(n, struct pohmelfs_name, offset_node);
+
+		cmp = pohmelfs_cmp_offset(tmp, offset);
+		if (cmp < 0)
+			n = n->rb_left;
+		else if (cmp > 0)
+			n = n->rb_right;
+		else
+			return tmp;
+	}
+
+	return NULL;
+}
+
+static struct pohmelfs_name *pohmelfs_insert_offset(struct pohmelfs_inode *pi,
+		struct pohmelfs_name *new)
+{
+	struct rb_node **n = &pi->offset_root.rb_node, *parent = NULL;
+	struct pohmelfs_name *ret = NULL, *tmp;
+	int cmp;
+
+	while (*n) {
+		parent = *n;
+
+		tmp = rb_entry(parent, struct pohmelfs_name, offset_node);
+
+		cmp = pohmelfs_cmp_offset(tmp, new->offset);
+		if (cmp < 0)
+			n = &parent->rb_left;
+		else if (cmp > 0)
+			n = &parent->rb_right;
+		else {
+			ret = tmp;
+			break;
+		}
+	}
+
+	if (ret)
+		return ret;
+
+	rb_link_node(&new->offset_node, parent, n);
+	rb_insert_color(&new->offset_node, &pi->offset_root);
+
+	pi->total_len += new->len;
+
+	return NULL;
+}
+
+static int pohmelfs_cmp_hash(struct pohmelfs_name *n, u32 hash, u32 len)
+{
+	if (n->hash > hash)
+		return -1;
+	if (n->hash < hash)
+		return 1;
+
+	if (n->len > len)
+		return -1;
+	if (n->len < len)
+		return 1;
+
+	return 0;
+}
+
+static struct pohmelfs_name *pohmelfs_search_hash(struct pohmelfs_inode *pi, u32 hash, u32 len)
+{
+	struct rb_node *n = pi->hash_root.rb_node;
+	struct pohmelfs_name *tmp;
+	int cmp;
+
+	while (n) {
+		tmp = rb_entry(n, struct pohmelfs_name, hash_node);
+
+		cmp = pohmelfs_cmp_hash(tmp, hash, len);
+		if (cmp < 0)
+			n = n->rb_left;
+		else if (cmp > 0)
+			n = n->rb_right;
+		else
+			return tmp;
+	}
+
+	return NULL;
+}
+
+static void __pohmelfs_name_del(struct pohmelfs_inode *parent, struct pohmelfs_name *node)
+{
+	rb_erase(&node->offset_node, &parent->offset_root);
+	rb_erase(&node->hash_node, &parent->hash_root);
+}
+
+/*
+ * Remove name cache entry from its caches and free it.
+ */
+static void pohmelfs_name_free(struct pohmelfs_inode *parent, struct pohmelfs_name *node)
+{
+	__pohmelfs_name_del(parent, node);
+	list_del(&node->sync_del_entry);
+	list_del(&node->sync_create_entry);
+	kfree(node);
+}
+
+static struct pohmelfs_name *pohmelfs_insert_hash(struct pohmelfs_inode *pi,
+		struct pohmelfs_name *new)
+{
+	struct rb_node **n = &pi->hash_root.rb_node, *parent = NULL;
+	struct pohmelfs_name *ret = NULL, *tmp;
+	int cmp;
+
+	while (*n) {
+		parent = *n;
+
+		tmp = rb_entry(parent, struct pohmelfs_name, hash_node);
+
+		cmp = pohmelfs_cmp_hash(tmp, new->hash, new->len);
+		if (cmp < 0)
+			n = &parent->rb_left;
+		else if (cmp > 0)
+			n = &parent->rb_right;
+		else {
+			ret = tmp;
+			break;
+		}
+	}
+
+	if (ret) {
+		printk("%s: exist: ino: %llu, hash: %x, len: %u, data: '%s', new: ino: %llu, hash: %x, len: %u, data: '%s'.\n",
+				__func__, ret->ino, ret->hash, ret->len, ret->data,
+				new->ino, new->hash, new->len, new->data);
+		ret->ino = new->ino;
+		return ret;
+	}
+
+	rb_link_node(&new->hash_node, parent, n);
+	rb_insert_color(&new->hash_node, &pi->hash_root);
+
+	return NULL;
+}
+
+/*
+ * Free name cache for given inode.
+ */
+void pohmelfs_free_names(struct pohmelfs_inode *parent)
+{
+	struct rb_node *rb_node;
+	struct pohmelfs_name *n;
+
+	for (rb_node = rb_first(&parent->offset_root); rb_node;) {
+		n = rb_entry(rb_node, struct pohmelfs_name, offset_node);
+		rb_node = rb_next(rb_node);
+
+		pohmelfs_name_free(parent, n);
+	}
+}
+
+/*
+ * When name cache entry is removed (for example when object is removed),
+ * offset for all subsequent childrens has to be fixed to match new reality.
+ */
+static int pohmelfs_fix_offset(struct pohmelfs_inode *parent, struct pohmelfs_name *node)
+{
+	struct rb_node *rb_node;
+	int decr = 0;
+
+	for (rb_node = rb_next(&node->offset_node); rb_node; rb_node = rb_next(rb_node)) {
+		struct pohmelfs_name *n = container_of(rb_node, struct pohmelfs_name, offset_node);
+
+		n->offset -= node->len;
+		decr++;
+	}
+
+	parent->total_len -= node->len;
+
+	return decr;
+}
+
+/*
+ * Fix offset and free name cache entry helper.
+ */
+void pohmelfs_name_del(struct pohmelfs_inode *parent, struct pohmelfs_name *node)
+{
+	int decr;
+
+	decr = pohmelfs_fix_offset(parent, node);
+
+	dprintk("%s: parent: %llu, ino: %llu, decr: %d.\n",
+			__func__, parent->ino, node->ino, decr);
+
+	pohmelfs_name_free(parent, node);
+}
+
+/*
+ * Insert new name cache entry into all caches (offset and name hash).
+ */
+static int pohmelfs_insert_name(struct pohmelfs_inode *parent, struct pohmelfs_name *n)
+{
+	struct pohmelfs_name *name;
+
+	name = pohmelfs_insert_offset(parent, n);
+	if (name)
+		return -EEXIST;
+
+	name = pohmelfs_insert_hash(parent, n);
+	if (name) {
+		parent->total_len -= n->len;
+		rb_erase(&n->offset_node, &parent->offset_root);
+		return -EEXIST;
+	}
+
+	list_add_tail(&n->sync_create_entry, &parent->sync_create_list);
+
+	return 0;
+}
+
+/*
+ * Allocate new name cache entry.
+ */
+static struct pohmelfs_name *pohmelfs_name_clone(unsigned int len)
+{
+	struct pohmelfs_name *n;
+
+	n = kzalloc(sizeof(struct pohmelfs_name) + len, GFP_KERNEL);
+	if (!n)
+		return NULL;
+
+	INIT_LIST_HEAD(&n->sync_create_entry);
+	INIT_LIST_HEAD(&n->sync_del_entry);
+
+	n->data = (char *)(n+1);
+
+	return n;
+}
+
+/*
+ * Add new name entry into directory's cache.
+ */
+static int pohmelfs_add_dir(struct pohmelfs_sb *psb, struct pohmelfs_inode *parent,
+		struct pohmelfs_inode *npi, struct qstr *str, unsigned int mode, int link)
+{
+	int err = -ENOMEM;
+	struct pohmelfs_name *n;
+	struct pohmelfs_path_entry *e = NULL;
+
+	n = pohmelfs_name_clone(str->len + 1);
+	if (!n)
+		goto err_out_exit;
+
+	n->ino = npi->ino;
+	n->offset = parent->total_len;
+	n->mode = mode;
+	n->len = str->len;
+	n->hash = str->hash;
+	sprintf(n->data, str->name);
+
+	if (!(str->len == 1 && str->name[0] == '.') &&
+			!(str->len == 2 && str->name[0] == '.' && str->name[1] == '.')) {
+		mutex_lock(&psb->path_lock);
+		e = pohmelfs_add_path_entry(psb, parent->ino, npi->ino, str, link, mode);
+		mutex_unlock(&psb->path_lock);
+		if (IS_ERR(e)) {
+			err = PTR_ERR(e);
+			goto err_out_free;
+		}
+	}
+
+	mutex_lock(&parent->offset_lock);
+	err = pohmelfs_insert_name(parent, n);
+	mutex_unlock(&parent->offset_lock);
+
+	if (err) {
+		if (err != -EEXIST)
+			goto err_out_remove;
+		kfree(n);
+	}
+
+	return 0;
+
+err_out_remove:
+	if (e) {
+		mutex_lock(&psb->path_lock);
+		pohmelfs_remove_path_entry(psb, e);
+		mutex_unlock(&psb->path_lock);
+	}
+err_out_free:
+	kfree(n);
+err_out_exit:
+	return err;
+}
+
+/*
+ * Create new inode for given parameters (name, inode info, parent).
+ * This does not create object on the server, it will be synced there during writeback.
+ */
+struct pohmelfs_inode *pohmelfs_new_inode(struct pohmelfs_sb *psb,
+		struct pohmelfs_inode *parent, struct qstr *str,
+		struct netfs_inode_info *info, int link)
+{
+	struct inode *new = NULL;
+	struct pohmelfs_inode *npi;
+	int err = -EEXIST;
+
+	dprintk("%s: creating inode: parent: %llu, ino: %llu, str: %p.\n",
+			__func__, (parent)?parent->ino:0, info->ino, str);
+
+	err = -ENOMEM;
+	new = iget_locked(psb->sb, info->ino);
+	if (!new)
+		goto err_out_exit;
+
+	npi = POHMELFS_I(new);
+	npi->ino = info->ino;
+	err = 0;
+
+	if (new->i_state & I_NEW) {
+		dprintk("%s: filling VFS inode: %lu/%llu.\n",
+				__func__, new->i_ino, info->ino);
+		pohmelfs_fill_inode(new, info);
+
+		if (S_ISDIR(info->mode)) {
+			struct qstr s;
+
+			s.name = ".";
+			s.len = 1;
+			s.hash = jhash(s.name, s.len, 0);
+
+			err = pohmelfs_add_dir(psb, npi, npi, &s, info->mode, 0);
+			if (err)
+				goto err_out_put;
+
+			s.name = "..";
+			s.len = 2;
+			s.hash = jhash(s.name, s.len, 0);
+
+			err = pohmelfs_add_dir(psb, npi, (parent)?parent:npi, &s,
+					(parent)?parent->vfs_inode.i_mode:npi->vfs_inode.i_mode, 0);
+			if (err)
+				goto err_out_put;
+		}
+	}
+
+	if (str) {
+		if (parent) {
+			err = pohmelfs_add_dir(psb, parent, npi, str, info->mode, link);
+
+			dprintk("%s: %s inserted name: '%s', new_offset: %llu, ino: %llu, parent: %llu.\n",
+					__func__, (err)?"unsuccessfully":"successfully",
+					str->name, parent->total_len, info->ino, parent->ino);
+
+			if (err && err != -EEXIST)
+				goto err_out_put;
+		} else {
+			mutex_lock(&psb->path_lock);
+			pohmelfs_add_path_entry(psb, npi->ino, npi->ino, str, link, info->mode);
+			mutex_unlock(&psb->path_lock);
+		}
+	}
+
+	if (new->i_state & I_NEW) {
+		if (parent)
+			mark_inode_dirty(&parent->vfs_inode);
+		mark_inode_dirty(new);
+
+#ifdef POHMELFS_CC_GROUP
+		pohmelfs_meta_command(npi, NETFS_JOIN_GROUP, 0, NULL, NULL, 0);
+#endif
+	}
+	unlock_new_inode(new);
+
+	return npi;
+
+err_out_put:
+	printk("%s: putting inode: %p, npi: %p, error: %d.\n", __func__, new, npi, err);
+	iput(new);
+err_out_exit:
+	return ERR_PTR(err);
+}
+
+static int pohmelfs_remote_sync_complete(struct page **pages, unsigned int page_num,
+		void *private, int err)
+{
+	struct pohmelfs_inode *pi = private;
+	struct pohmelfs_sb *psb = POHMELFS_SB(pi->vfs_inode.i_sb);
+
+	dprintk("%s: ino: %llu, err: %d.\n", __func__, pi->ino, err);
+
+	if (err)
+		pi->error = err;
+	wake_up(&psb->wait);
+	pohmelfs_put_inode(pi);
+
+	return err;
+}
+
+/*
+ * Receive directory content from the server.
+ * This should be only done for objects, which were not created locally,
+ * and which were not synced previously.
+ */
+static int pohmelfs_sync_remote_dir(struct pohmelfs_inode *pi)
+{
+	struct inode *inode = &pi->vfs_inode;
+	struct pohmelfs_sb *psb = POHMELFS_SB(inode->i_sb);
+	long ret = msecs_to_jiffies(25000);
+	int err;
+
+	dprintk("%s: dir: %llu, state: %lx: created: %d, remote_synced: %d.\n",
+			__func__, pi->ino, pi->state, test_bit(NETFS_INODE_CREATED, &pi->state),
+			test_bit(NETFS_INODE_REMOTE_SYNCED, &pi->state));
+
+	if (!test_bit(NETFS_INODE_CREATED, &pi->state))
+		return 0;
+
+	if (test_bit(NETFS_INODE_REMOTE_SYNCED, &pi->state))
+		return 0;
+
+	if (!igrab(inode)) {
+		err = -ENOENT;
+		goto err_out_exit;
+	}
+
+	err = pohmelfs_meta_command(pi, NETFS_READDIR, NETFS_TRANS_SINGLE_DST,
+			pohmelfs_remote_sync_complete, pi, 0);
+	if (err)
+		goto err_out_exit;
+
+	pi->error = 0;
+	ret = wait_event_interruptible_timeout(psb->wait,
+			test_bit(NETFS_INODE_REMOTE_SYNCED, &pi->state) || pi->error, ret);
+	dprintk("%s: awake dir: %llu, ret: %ld, err: %d.\n", __func__, pi->ino, ret, pi->error);
+	if (ret <= 0) {
+		err = -ETIMEDOUT;
+		goto err_out_exit;
+	}
+
+	if (pi->error)
+		return pi->error;
+
+	return 0;
+
+err_out_exit:
+	clear_bit(NETFS_INODE_REMOTE_SYNCED, &pi->state);
+
+	return err;
+}
+
+/*
+ * VFS readdir callback. Syncs directory content from server if needed,
+ * and provide info to userspace.
+ */
+static int pohmelfs_readdir(struct file *file, void *dirent, filldir_t filldir)
+{
+	struct inode *inode = file->f_path.dentry->d_inode;
+	struct pohmelfs_inode *pi = POHMELFS_I(inode);
+	struct pohmelfs_name *n;
+	int err = 0, mode;
+	u64 len;
+
+	dprintk("%s: parent: %llu.\n", __func__, pi->ino);
+
+	err = pohmelfs_sync_remote_dir(pi);
+	if (err)
+		return err;
+
+	while (1) {
+		mutex_lock(&pi->offset_lock);
+		n = pohmelfs_search_offset(pi, file->f_pos);
+		if (!n) {
+			mutex_unlock(&pi->offset_lock);
+			err = 0;
+			break;
+		}
+
+		mode = (n->mode >> 12) & 15;
+
+		dprintk("%s: offset: %llu, parent ino: %llu, name: '%s', len: %u, ino: %llu, mode: %o/%o.\n",
+				__func__, file->f_pos, pi->ino, n->data, n->len,
+				n->ino, n->mode, mode);
+
+		len = n->len;
+		err = filldir(dirent, n->data, n->len, file->f_pos, n->ino, mode);
+		mutex_unlock(&pi->offset_lock);
+
+		if (err < 0) {
+			dprintk("%s: err: %d.\n", __func__, err);
+			err = 0;
+			break;
+		}
+
+		file->f_pos += len;
+	}
+
+	return err;
+}
+
+const struct file_operations pohmelfs_dir_fops = {
+	.read = generic_read_dir,
+	.readdir = pohmelfs_readdir,
+};
+
+/*
+ * Lookup single object on server.
+ */
+static int pohmelfs_lookup_single(struct pohmelfs_inode *parent,
+		struct qstr *str, u64 ino)
+{
+	struct pohmelfs_sb *psb = POHMELFS_SB(parent->vfs_inode.i_sb);
+	long ret = msecs_to_jiffies(5000);
+	int err;
+
+	set_bit(NETFS_COMMAND_PENDING, &parent->state);
+	err = pohmelfs_meta_command_data(parent, NETFS_LOOKUP,
+			(char *)str->name, NETFS_TRANS_SINGLE_DST, NULL, NULL, ino);
+	if (err)
+		goto err_out_exit;
+
+	err = 0;
+	ret = wait_event_interruptible_timeout(psb->wait,
+			!test_bit(NETFS_COMMAND_PENDING, &parent->state), ret);
+	if (ret == 0)
+		err = -ETIMEDOUT;
+	else if (signal_pending(current))
+		err = -EINTR;
+
+	if (err)
+		goto err_out_exit;
+
+	return 0;
+
+err_out_exit:
+	clear_bit(NETFS_COMMAND_PENDING, &parent->state);
+
+	printk("%s: failed: parent: %llu, ino: %llu, name: '%s', err: %d.\n",
+			__func__, parent->ino, ino, str->name, err);
+
+	return err;
+}
+
+/*
+ * VFS lookup callback.
+ * We first try to get inode number from local name cache, if we have one,
+ * then inode can be found in inode cache. If there is no inode or no object in
+ * local cache, try to lookup it on server. This only should be done for directories,
+ * which were not created locally, otherwise remote server does not know about dir at all,
+ * so no need to try to know that.
+ */
+struct dentry *pohmelfs_lookup(struct inode *dir, struct dentry *dentry, struct nameidata *nd)
+{
+	struct pohmelfs_inode *parent = POHMELFS_I(dir);
+	struct pohmelfs_name *n;
+	struct inode *inode = NULL;
+	unsigned long ino = 0;
+	int err;
+	struct qstr str = dentry->d_name;
+
+	str.hash = jhash(dentry->d_name.name, dentry->d_name.len, 0);
+
+	mutex_lock(&parent->offset_lock);
+	n = pohmelfs_search_hash(parent, str.hash, str.len);
+	if (n)
+		ino = n->ino;
+	mutex_unlock(&parent->offset_lock);
+
+	dprintk("%s: 1 ino: %lu, inode:
Previous message: [thread] [date] [author]
Next message: [thread] [date] [author]

Messages in current thread:
[2/3] POHMELFS: Documentation., Evgeniy Polyakov, (Mon Jul 7, 2:10 pm)
Re: [2/3] POHMELFS: Documentation., Pavel Machek, (Sat Jul 12, 3:01 am)
Re: [2/3] POHMELFS: Documentation., Evgeniy Polyakov, (Sat Jul 12, 3:26 am)
[1/3] POHMELFS: VFS trivial change., Evgeniy Polyakov, (Mon Jul 7, 2:08 pm)
[3/3] POHMELFS: core files., Evgeniy Polyakov, (Mon Jul 7, 2:11 pm)