Re: Copy/move btrfs volume

From: Lubos Kolouch
Date: Thursday, July 1, 2010 - 3:28 am

Hello,

I am testing btrfs on one of our backup servers
(many millions of files, 1.5TB size, running on (non-btrfs-provided-) 
raid5).

I am using subvolumes/snapshots, each followed by an rsync run.

It works very well, but I would like to ask a question: say I need 
to copy/move the files to a different server/disk.

Normally I would do it with rsync, but I guess it will not preserve the 
subvolumes, it will also not detect that they are the same files (I guess
they are not just normal hardlinks). So I would end up with duplicated 
files.

What is the correct way to do this?

Thank you and best regards

Lubos Kolouch

--

From: Daniel J Blueman
Date: Thursday, July 1, 2010 - 4:26 am

The only way to do this preserving duplication is to use hardlinks
between duplicated files (which reference counts the inode), and use
'rsync -H'.
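
A minimal sketch of the 'rsync -H' invocation described above. The helper names (`run`, `copy_with_hardlinks`) and the example paths are illustrative assumptions, not something from this thread. The wrapper only prints the command unless DRY_RUN=0 is set, so it can be inspected before anything is copied.

```shell
#!/bin/sh
# Dry-run by default: print the command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# copy_with_hardlinks SRC DST  (hypothetical helper)
#   -a  archive mode (permissions, times, symlinks, ...)
#   -H  preserve hard links among the files in this transfer
copy_with_hardlinks() {
    run rsync -aH "$1/" "$2/"
}
```

For example, `copy_with_hardlinks /srv/backup /mnt/new` prints the rsync command line. Note that -H only relates files that are hard links of one another within a single transfer.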

Dan
-- 
Daniel J Blueman
--

From: Lubos Kolouch
Date: Thursday, July 1, 2010 - 4:33 am

But when the files are on different snapshots, does rsync see them as 
hardlinked?

A scenario: I have a raid5 of, say, 1TB HDDs. It contains many snapshots.
Then, a few years later, a new machine is bought with, say, 5TB 
discs.

So I need to transfer the btrfs volume to the new machine. 

But how to do it so that it looks the *same*, i.e. with the same snapshots?
I could of course write a custom script to create the subvolume, rsync 
the files, create snapshot, rsync files, etc,

but it would be nice if the btrfs toolset supports this by default...
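
The custom script mentioned above might look roughly like the sketch below. It assumes each source snapshot is reachable as a directory under one root and is passed oldest first. The layout, the `current` subvolume name and the helper names are assumptions made for illustration. The rsync options --inplace and --no-whole-file are used in the hope that unchanged data stays shared between the recreated snapshots. Commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run by default: print the command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# replay_snapshots SRC_ROOT DST_ROOT NAME...  (hypothetical helper)
# SRC_ROOT/NAME are the existing snapshots, oldest first.
# DST_ROOT is a mounted btrfs filesystem on the new disk.
replay_snapshots() {
    src_root=$1
    dst_root=$2
    shift 2
    # One working subvolume receives every rsync pass.
    run btrfs subvolume create "$dst_root/current"
    for name in "$@"; do
        # Bring the working subvolume to the state of this snapshot.
        run rsync -aH --inplace --no-whole-file --delete \
            "$src_root/$name/" "$dst_root/current/"
        # Freeze that state under the original snapshot name.
        run btrfs subvolume snapshot "$dst_root/current" "$dst_root/$name"
    done
}
```

Calling `replay_snapshots /mnt/old /mnt/new day1 day2` prints one create, then an rsync and a snapshot per name.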

Lubos

--

From: Matt Brown
Date: Thursday, July 1, 2010 - 3:21 pm

Hello,

With backed up files consisting of hard links, I usually use dd to copy
the file systems at the block level

# dd if=/dev/sda of=/dev/sdb bs=20M

and then expand the file system. This is because I found that tools like
rsync, while usually fast, are extremely slow when dealing with millions
of hard linked files.
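
As a rough sketch of the dd-then-expand sequence for a btrfs target: the device names, the mount point and the helper names are assumptions. The source should be unmounted (or at least read-only) during the copy. A block-level clone also carries the same filesystem UUID as the original, so it is probably safest to detach the old disk before mounting the new one. Commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run by default: print the command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# block_copy_and_grow SRC_DEV DST_DEV MOUNTPOINT  (hypothetical helper)
block_copy_and_grow() {
    src_dev=$1
    dst_dev=$2
    mnt=$3
    # Copy the whole device at the block level.
    run dd if="$src_dev" of="$dst_dev" bs=20M
    # btrfs is resized while mounted.
    run mount "$dst_dev" "$mnt"
    # Grow the filesystem to fill the larger device.
    run btrfs filesystem resize max "$mnt"
}
```

`block_copy_and_grow /dev/sda /dev/sdb /mnt/new` prints the three steps in order. For ext4 the last step would be a different tool (resize2fs), which is not shown here.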


For me, I had to copy over BackupPC hardlinked files from a full disk to
a smaller disk, both using ext4, and I could not use dd. What normally
should have taken an hour, instead took almost a week. (Yes, I wanted to
use btrfs, but it had a hard link limit of 255 - don't know if it still
does.)

It would be nice to have a btrfs command that could rapidly copy over
the file system, snapshots, and all other file system info.

But what benefit would having a native btrfs 'copy/rsync' command have
over the dd/resize option?

Pros
- Files will be immediately checksummed on new disks, but this may not be
as important since a checksum/verify command will be implemented.
- Great 'feature' for copying files to new drives, and keeping
snapshots. Could even be used to export snapshots.
- I believe compressed files will have to be uncompressed and
recompressed, depending on when the file is checksummed. (I may be wrong on
this one). This will actually be a con for slow and/or heavily loaded machines.
- One command instead of many (dd -> resize -> verify).

Cons
- File system would still have to be unmounted, or at least read-only,
as I doubt the command will have rsync's update or delete abilities.
But, maybe it could.

Questionable
- May be faster than dd/resize, or it may be just as slow as rsync is
with hard links. And I am talking about dozens to thousands of
snapshots, and millions to billions of files.

Matt
--

From: Oystein Viggen
Date: Thursday, July 1, 2010 - 11:15 pm

If you can (temporarily) attach the old and new drives to the same
computer, putting the ext4 BackupPC store on LVM and moving the LV
around might be more convenient, or at least feel more "high level".

For btrfs with lots of snapshots, I believe "btrfs device add" of the
new device followed by "btrfs device remove" of the old one would be the
most convenient.

One advantage of using LVM and btrfs multi device support in this way is
that the actual downtime is minimal -- you can keep the filesystems
online.  Even on cheap hardware, the only downtime should be to
attach/remove disks.
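
A sketch of the btrfs add-then-remove migration described above, with both disks attached to one machine and the filesystem mounted. Device names, mount point and helper names are assumptions. Some btrfs-progs versions spell the second step "btrfs device delete". Commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run by default: print the command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# migrate_device OLD_DEV NEW_DEV MOUNTPOINT  (hypothetical helper)
migrate_device() {
    old_dev=$1
    new_dev=$2
    mnt=$3
    # Add the new disk to the mounted filesystem.
    run btrfs device add "$new_dev" "$mnt"
    # Removing the old disk moves its data onto the remaining device(s).
    run btrfs device remove "$old_dev" "$mnt"
}
```

`migrate_device /dev/sda /dev/sdb /mnt/backup` prints both steps. The filesystem stays mounted throughout, which is the point made above about minimal downtime.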

Øystein
-- 
If it ain't broke, don't break it.

--

From: Lubos Kolouch
Date: Saturday, July 3, 2010 - 12:33 am

This solution is very elegant and cool - if you can put the discs into one 
computer.

It does not help much with copying the files over the network while preserving the 
snapshots... or can a network-attached device (sshfs) be added this way?

Lubos

--

From: Hubert Kario
Date: Wednesday, July 21, 2010 - 8:00 am

You could also go the totally cool option (albeit a bit crazy) and use 
network block devices and have no downtime...

The overall process will take more time though.

-- 
Hubert Kario
QBS - Quality Business Software
ul. Ksawerów 30/85
02-656 Warszawa
POLAND
tel. +48 (22) 646-61-51, 646-74-24
fax +48 (22) 646-61-50
--

From: Chris Mason
Date: Thursday, July 1, 2010 - 6:29 pm

This is definitely something I'm looking to add.  The btrfs-progs git
tree has some code that allows userland to walk the btrees and detect
the duplicate files.  But this is just a building block needed for the
full backup program.

Instead of hard links, it is possible to use reflinks with cp, which
uses the cloning ioctl.
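
A small sketch of a reflink copy with GNU cp. The helper name `clone_file` is an assumption. With --reflink=auto, cp falls back to an ordinary copy when the filesystem cannot clone. Passing `always` as the third argument makes it fail instead.

```shell
#!/bin/sh
# clone_file SRC DST [MODE]  (hypothetical helper)
# MODE is "auto" (default: clone if possible, else plain copy)
# or "always" (fail if the filesystem cannot clone).
clone_file() {
    mode=${3:-auto}
    cp --reflink="$mode" -- "$1" "$2"
}
```

On btrfs the clone shares data blocks with the original until one of the two files is modified. Unlike a hard link, the copy is a separate inode.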

-chris
--
