Bolt: Data Management of Connected Homes

Bolt is a data management system for an emerging class of applications that let IoT devices in the home interact and store data. These applications have unique requirements, such as support for time-series and tagged data, the ability to share data between devices, and assurances of data confidentiality and integrity, which older platforms do not satisfy. Platforms such as HomeOS and MiCasaVerde provide high-level abstractions mainly for device interaction, not for storage. The following paragraph elaborates the data manipulation characteristics of these IoT applications, which are one of the main reasons for creating Bolt.

The observed data manipulation characteristics of these IoT applications are: 1) a single writer exists per stream, 2) new data is always appended, 3) there is no random access to it, and 4) queries typically retrieve temporally proximate records from the stream. Traditional databases, with their support for transactions, concurrency control, and recovery protocols, are overkill for such data, while file-based storage offers an inadequate query interface since filesystem access is sequential. In addition, data needs to be shared between applications and secured both in transit and at rest on the storage medium. The system should also support policy-based storage, which helps minimize cost and use resources efficiently. Bolt supports these data management characteristics, unlike existing storage abstractions. Next, we explain the key techniques Bolt uses to tailor data management to these applications.

The four key techniques are chunking, separation of index and data, segmentation, and decentralized access control with signed hashes. Chunking groups a contiguous sequence of records into chunks; data is accessed and stored at the granularity of chunks, which improves efficiency by batching transfers and amortizing round-trip delays. Second, separating the index from the data helps in two ways: 1) the index is queried locally, and 2) the trust assumption on the cloud is reduced (data is stored encrypted in the cloud and decrypted only on the client side). Third, segmentation divides a data stream into smaller segments of user-defined size, which makes it possible to archive old segments as the amount of data in the stream grows. Finally, Bolt uses decentralized access control and signed hashes to provide confidentiality for data stored on untrusted cloud storage: it encrypts the data with the owner's secret key and distributes keys via a trusted key server. The subsequent paragraph gives an idea of Bolt's implementation.
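To make chunking and the signed hashes concrete, here is a minimal Python sketch; the record format, chunk size, and names are assumptions of mine, not Bolt's actual layout:

import hashlib

CHUNK_SIZE = 4  # records per chunk; Bolt lets applications tune chunk size

def chunk_datalog(records):
    """Group a contiguous sequence of records into chunks and hash each chunk.

    The list of per-chunk hashes is what the owner would sign and upload as
    integrity metadata, so readers can verify every chunk independently.
    """
    chunks, chunk_hashes = [], []
    for i in range(0, len(records), CHUNK_SIZE):
        chunk = b"".join(records[i:i + CHUNK_SIZE])
        chunks.append(chunk)
        chunk_hashes.append(hashlib.sha256(chunk).hexdigest())
    return chunks, chunk_hashes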

Bolt's APIs allow an application to create a data stream of one of two types: ValueStream or FileStream. The former is used for small data values such as temperature readings, the latter for larger values like images or videos. Data is added to a stream as a time-tag-value record using an append API. A stream consists of two parts: a log of data records (the DataLog) and an index that maps a tag to a list of data item identifiers. When a stream is closed, Bolt chunks the segment's DataLog, compresses and encrypts the chunks, and generates a ChunkList. It then uploads the chunks, the updated ChunkList, and the index to the storage server; chunks are uploaded in parallel, and the application can configure the maximum number of parallel uploads. Finally, the stream's integrity metadata is uploaded to the metadata server. As mentioned in the previous paragraph, streams are encrypted with a secret key known only to the owner; to grant access to other readers, the owner updates the stream metadata with the secret key encrypted under each reader's public key. To read data, a client first checks the integrity of the metadata using the owner's public key, and its freshness using a TTL in the stream metadata, before downloading the index and DataLog.
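Bolt itself is exposed to applications as a library; the following Python-flavoured toy model only mirrors the append/index/close flow described above. The class and method names are mine, and chunking, encryption, and upload are reduced to a comment:

import time
from collections import defaultdict

class ValueStream:
    """Toy model of a Bolt-style stream; names are illustrative, not Bolt's API."""

    def __init__(self, name):
        self.name = name
        self.datalog = []                 # log of (time, tag, value) records
        self.index = defaultdict(list)    # tag -> offsets into the DataLog

    def append(self, tag, value):
        """Add a time-tag-value record and update the local index."""
        self.index[tag].append(len(self.datalog))
        self.datalog.append((time.time(), tag, value))

    def get_latest(self, tag):
        """Index lookups happen locally; only data fetches touch the server."""
        offsets = self.index[tag]
        return self.datalog[offsets[-1]] if offsets else None

    def close(self):
        # Here Bolt would chunk the DataLog, compress and encrypt the chunks,
        # build the ChunkList, and upload chunks + index to the storage server.
        pass

stream = ValueStream("home/temperature")
stream.append("living-room", 21.4)
print(stream.get_latest("living-room"))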

In this paragraph, I am listing a few drawbacks of Bolt: 1) it depends fully on the control plane, 2) devices cannot subscribe to a particular data stream generated by another device, 3) each device has its own data stream, and Bolt has no feature to merge streams, 4) because Bolt uses cloud storage, it inherits the pitfalls of current IoT applications that lean on the cloud, and 5) global scalability will be a challenge since Bolt lacks location-independent routing of segments. Bolt also uses custom IoT gateways, which can lead to interoperability issues.

Bolt's performance was evaluated in two ways: microbenchmarks (comparing stream operations against raw filesystem reads and writes, called DiskRaw) and real-world use cases. In the first approach, the authors measured write performance, read performance, and scalability for ValueStream, FileStream, and remote ValueStream. A ValueStream was compared to a single file in DiskRaw, and a FileStream to multiple files. The results show that ValueStream incurs higher overhead than DiskRaw for local writes. For remote streams, 64% of the total time went to chunking and uploading the DataLog, and 3% to uploading the index. FileStream performance is comparable to DiskRaw for local writes. The storage overhead of ValueStream over DiskRaw decreases with larger value sizes. Local ValueStream reads are hindered by the index lookup and data deserialization, while download cost dominates remote reads; FileStream shows similar behavior. Chunking improves read throughput for temporal range queries. Finally, the time taken to open a stream is dominated by building the segment index in memory and grows linearly with the number of segments. The second part of the evaluation is explained in the next paragraph.

They conducted a feasibility and performance analysis of Bolt with three real-world applications: PreHeat, Digital Neighborhood Watch (DNW), and Energy Data Analytics (EDA). The results were compared against the same applications running on OpenTSDB. In the first application, the average retrieval time from a remote ValueStream decreases as the chunk size increases. In DNW, chunks improve retrieval time by batching transfers, even though some of the downloaded data may not be needed. For EDA, retrieval time grows proportionally for both Bolt and OpenTSDB, but Bolt outperforms OpenTSDB by an order of magnitude, primarily due to prefetching data in chunks. Bolt's storage overhead is 3-5x lower than OpenTSDB's for all three applications.

The experiments are thorough and show the benefits of the Bolt data management system. But we found the following two weaknesses: 1) the comparison between OpenTSDB and Bolt may be misleading, since OpenTSDB is a distributed time-series database built on HBase and designed for a very different deployment scale, and 2) scalability is only weakly tested in the microbenchmarks.

To conclude this summary, Bolt is a well-suited data management system for the emerging class of applications that manage IoT devices in the home. It meets requirements of these applications that are unavailable on existing platforms. The experiments in the paper show that Bolt performs up to 40x faster than OpenTSDB with 3-5x lower storage overhead. The drawbacks highlight challenges that need to be solved before Bolt can be deployed in highly scalable use cases.


Filed under bolt, Distributed Systems, IoT, Operating Systems, Storage

The Cloud is Not Enough: Saving IoT from the Cloud

The Internet of Things (IoT) represents a new class of applications that leverage the cloud. This has allowed us to collect data from sensors and stream it to the cloud without worrying about the economic viability of storing and processing that data. But the current approach of connecting IoT applications directly to the cloud has many drawbacks: concerns regarding privacy, security, scalability, latency, bandwidth, availability, and durability of the data these applications generate remain unaddressed. To overcome these drawbacks, the paper adopts a data-centric approach that creates an abstraction between IoT applications and the cloud.

The data-centric abstraction is called the Global Data Plane (GDP), and it focuses on the distribution, preservation, and protection of data. It supports the same application model as the cloud while better matching the needs and characteristics of the IoT, by utilizing heterogeneous computing platforms, such as small gateway devices, moderately powerful nodes in the environment, and the cloud, in a distributed manner. The basic foundation of the GDP is the secure single-writer log; applications built on top of it are interconnected through log streams rather than by addressing devices or services via IP.

The data generated by IoT devices is represented as logs, specifically single-writer time-series logs. A log is append-only and mostly read sequentially, and it can be securely replicated and validated through cryptographic hashes. The log-based approach addresses flexibility, access control, authenticity, integrity, encryption, durability, and replication of data. These logs also need to be stored on infrastructure, and the current cloud storage approach doesn't offer flexible placement, low latency, or durability guarantees. To enable these properties, the paper introduces location-independent routing, in which packets are routed through an overlay network built on distributed hash table (DHT) technology. Dynamic topology changes, publish/subscribe, and multicast trees can be built over this overlay network to optimize latency and network bandwidth. Although the GDP provides most of the functionality applications need, some applications require additional support, which can be provided by a Common Access API (CAAPI). A CAAPI is a layer above the GDP and plays a major role in replaying logs when a service fails; checkpointing techniques can be used to avoid the overhead of log replay.
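As a rough sketch of why a hash-chained, single-writer log is easy to validate, here is a toy Python model; the record layout and all names are mine, not the GDP wire format:

import hashlib
import json
import time

class SingleWriterLog:
    """Toy model of a GDP-style single-writer, append-only log."""

    def __init__(self, name):
        self.name = name                  # GDP names are location-independent IDs
        self.records = []
        self.prev_hash = "0" * 64         # anchor of the hash chain

    def append(self, payload):
        """Only the single writer calls this; records are never modified."""
        record = {"seq": len(self.records), "ts": time.time(),
                  "prev": self.prev_hash, "payload": payload}
        self.prev_hash = hashlib.sha256(
            json.dumps(record, sort_keys=True).encode()).hexdigest()
        self.records.append(record)

    def verify(self):
        """Any reader can re-walk the hash chain to validate integrity."""
        prev = "0" * 64
        for rec in self.records:
            if rec["prev"] != prev:
                return False
            prev = hashlib.sha256(
                json.dumps(rec, sort_keys=True).encode()).hexdigest()
        return True

log = SingleWriterLog("sensor/temperature")
log.append({"celsius": 21.4})
assert log.verify()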

The data-centric approach used in this paper helps overcome the pitfalls of today's IoT applications. Though these problems are also prevalent in web applications, they become more complex in the IoT space. I have written this write-up based on the paper "The Cloud is Not Enough: Saving IoT from the Cloud" [1].

[1]:https://www.usenix.org/conference/hotcloud15/workshop-program/presentation/zhang


Filed under Distributed Systems, Global Data Plane, IoT, Storage

Making a 64-bit Operating System from Scratch

A computer operating system performs many complex tasks, such as process management, memory management, and I/O management, and building one is among the hardest tasks for a computer scientist. But it is always exciting to experience the process of building an operating system. This blog post will walk through the steps of making a minimal 64-bit operating system.

You need to have a cross compiler installed on the machine on which you are going to build the new operating system. I have explained this process in my blog post "Building a cross compiler".

First we have to create an assembly file, boot.s, which will set up virtual addressing and long mode,

boot.s

[BITS 32]
[SECTION .mbhdr]
[EXTERN _loadStart]
[EXTERN _loadEnd]
[EXTERN _bssEnd]
 
ALIGN 8
MbHdr:
 ; Magic
 DD 0xE85250D6
 ; Architecture
 DD 0
 ; Length
 DD HdrEnd - MbHdr
 ; Checksum
 DD -(0xE85250D6 + 0 + (HdrEnd - MbHdr))
 
 ;
 ; Tags
 ;
 
 ; Sections override
 DW 2, 0
 DD 24
 DD MbHdr
 DD _loadStart
 DD _loadEnd
 DD _bssEnd
 
 ; Entry point override
 DW 3, 0
 DD 12
 DD EntryPoint
 DD 0 ; align next tag to 8 byte boundary
 
 ; End Of Tags
 DW 0, 0
 DD 8
 
 ; Hdr End Mark
HdrEnd:
[SECTION .boot]
[GLOBAL EntryPoint]
[EXTERN Stack]
EntryPoint:
 mov eax, Gdtr1
 lgdt [eax]
 
 ; Far-return to reload CS with the 32-bit code selector (0x08)
 push 0x08
 push .GdtReady
 retf
 
.GdtReady:
 mov eax, 0x10
 mov ds, ax
 mov ss, ax
 mov esp, Stack
 
 call SetupPagingAndLongMode
 
 mov eax, Gdtr2
 lgdt [eax]
 
 ; Far-return again, this time into the 64-bit code segment
 push 0x08
 push .Gdt2Ready
 retf
 
[BITS 64]
[EXTERN main]
.Gdt2Ready:
 mov eax, 0x10
 mov ds, ax
 mov es, ax
 mov ss, ax
 
 mov rsp, Stack + 0xFFFFFFFF80000000 ; move the stack to its higher-half address
 
 ; If you later decide to unmap the lower zone, you will have an invalid Gdt if you're still using Gdtr2
 mov rax, Gdtr3
 lgdt [rax]
 
 mov rax, main
 call rax
 cli
 jmp $
 
[BITS 32]
[EXTERN Pml4]
[EXTERN Pdpt]
[EXTERN Pd]
SetupPagingAndLongMode:
 mov eax, Pdpt
 or eax, 1
 mov [Pml4], eax          ; first PML4 slot: identity map low memory
 mov [Pml4 + 0xFF8], eax  ; last PML4 slot: mirror the mapping into the higher half
 
 mov eax, Pd
 or eax, 1
 mov [Pdpt], eax
 mov [Pdpt + 0xFF0], eax
 
 ; Map the first 8 MB with four 2 MB pages (present | writable | PS = 0x83)
 mov dword [Pd], 0x000083
 mov dword [Pd + 8], 0x200083
 mov dword [Pd + 16], 0x400083
 mov dword [Pd + 24], 0x600083
 
 ; Load CR3 with PML4
 mov eax, Pml4
 mov cr3, eax
 
 ; Enable PAE
 mov eax, cr4
 or eax, 1 << 5
 mov cr4, eax
 
 ; Enable Long Mode in the MSR
 mov ecx, 0xC0000080
 rdmsr
 or eax, 1 << 8
 wrmsr
 
 ; Enable Paging
 mov eax, cr0
 or eax, 1 << 31
 mov cr0, eax
 
 ret
 
TmpGdt:
 DQ 0x0000000000000000
 DQ 0x00CF9A000000FFFF
 DQ 0x00CF92000000FFFF
 DQ 0x0000000000000000
 DQ 0x00A09A0000000000
 DQ 0x00A0920000000000
 
Gdtr1:
 DW 23
 DD TmpGdt
 
Gdtr2:
 DW 23
 DD TmpGdt + 24
 DD 0
 
Gdtr3:
 DW 23
 DQ TmpGdt + 24 + 0xFFFFFFFF80000000

To combine all the object files into a single executable we need a linker.ld file,

linker.ld

ENTRY(EntryPoint)
VIRT_BASE = 0xFFFFFFFF80000000;
SECTIONS
{    . = 0x100000;
     .boot :
     {
         *(.mbhdr)
         _loadStart = .;
         *(.boot)
         . = ALIGN(4096);
         Pml4 = .;
         . += 0x1000;
         Pdpt = .;
         . += 0x1000;
         Pd = .;
         . += 0x1000;
         . += 0x8000;
         Stack = .;
     }
     . += VIRT_BASE;
     .text ALIGN(0x1000) : AT(ADDR(.text) - VIRT_BASE)
     {
         *(.text)
         *(.gnu.linkonce.t*)
     }
 
     .data ALIGN(0x1000) : AT(ADDR(.data) - VIRT_BASE)
     {
         *(.data)
         *(.gnu.linkonce.d*)
     }
 
     .rodata ALIGN(0x1000) : AT(ADDR(.rodata) - VIRT_BASE)
     {
         *(.rodata*)
         *(.gnu.linkonce.r*)
     }
 
     _loadEnd = . - VIRT_BASE;
 
     .bss ALIGN(0x1000) : AT(ADDR(.bss) - VIRT_BASE)
     {
          *(COMMON)
          *(.bss)
          *(.gnu.linkonce.b*)
     }
 
     _bssEnd = . - VIRT_BASE;
 
     /DISCARD/ :
    {
       *(.comment)
       *(.eh_frame)
    }
}

The next step is to create a main.c file, which will print a hello world message on the screen.

main.c

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

static const uint8_t COLOR_BLACK = 0;
static const uint8_t COLOR_BLUE = 1;
static const uint8_t COLOR_GREEN = 2;
static const uint8_t COLOR_CYAN = 3;
static const uint8_t COLOR_RED = 4;
static const uint8_t COLOR_MAGENTA = 5;
static const uint8_t COLOR_BROWN = 6;
static const uint8_t COLOR_LIGHT_GREY = 7;
static const uint8_t COLOR_DARK_GREY = 8;
static const uint8_t COLOR_LIGHT_BLUE = 9;
static const uint8_t COLOR_LIGHT_GREEN = 10;
static const uint8_t COLOR_LIGHT_CYAN = 11;
static const uint8_t COLOR_LIGHT_RED = 12;
static const uint8_t COLOR_LIGHT_MAGENTA = 13;
static const uint8_t COLOR_LIGHT_BROWN = 14;
static const uint8_t COLOR_WHITE = 15;

uint8_t make_color(uint8_t fg, uint8_t bg)
{
       return fg | bg << 4;
}

uint16_t make_vgaentry(char c, uint8_t color)
{
       uint16_t c16 = c;
       uint16_t color16 = color;
       return c16 | color16 << 8;
}

size_t strlen(const char* str)
{
       size_t ret = 0;
       while ( str[ret] != 0 )
             ret++;
       return ret;
}

static const size_t VGA_WIDTH = 80;
static const size_t VGA_HEIGHT = 25; /* standard 80x25 VGA text mode */

size_t terminal_row;
size_t terminal_column;
uint8_t terminal_color;
uint16_t* terminal_buffer;

void terminal_initialize()
{
       terminal_row = 0;
       terminal_column = 0;
       terminal_color = make_color(COLOR_LIGHT_GREY, COLOR_BLACK);
       terminal_buffer = (uint16_t*) 0xB8000;
       size_t y, x;
       for ( y = 0; y < VGA_HEIGHT; y++ )
            for ( x = 0; x < VGA_WIDTH; x++ )
            {
                   const size_t index = y * VGA_WIDTH + x;
                   terminal_buffer[index] = make_vgaentry(' ', terminal_color);
            }
} 

void terminal_setcolor(uint8_t color) 
{ 
        terminal_color = color; 
} 

void terminal_putentryat(char c, uint8_t color, size_t x, size_t y) 
{ 
        const size_t index = y * VGA_WIDTH + x; 
        terminal_buffer[index] = make_vgaentry(c, color); 
} 

void terminal_putchar(char c) { 
        /* Handle newline: move to the start of the next row */
        if ( c == '\n' )
        { 
             terminal_column = 0; 
             if ( ++terminal_row == VGA_HEIGHT ) 
                   terminal_row = 0; 
             return; 
        } 
        terminal_putentryat(c, terminal_color, terminal_column, terminal_row); 
        if ( ++terminal_column == VGA_WIDTH ) 
        { 
             terminal_column = 0; 
             if ( ++terminal_row == VGA_HEIGHT ) 
             { 
                    terminal_row = 0; 
             } 
        }
} 
void terminal_writestring(const char* data) 
{ 
       size_t datalen = strlen(data); 
       size_t i; 
       for ( i = 0; i < datalen; i++ ) 
       terminal_putchar(data[i]); 
} 
int main() 
{ 
       terminal_initialize(); 
       terminal_writestring("Hello, kernel World!\n"); 
}

The final step before building the new operating system is to create a Makefile.

Makefile

ISO := os.iso
OUTPUT := kernel.sys

OBJS := boot.o main.o

all: $(ISO)

$(ISO): $(OUTPUT)
        cp $(OUTPUT) iso/boot
        grub-mkrescue -o $@ iso

$(OUTPUT): $(OBJS) linker.ld
        x86_64-elf-ld -T linker.ld -o $@ $(OBJS)

.s.o:
        nasm -felf64 $< -o $@

.c.o:
        x86_64-elf-gcc -mcmodel=kernel -ffreestanding -mno-red-zone -c $< -o $@

clean:
        @rm -f $(OBJS) $(OUTPUT)

The final stage involves creating the GRUB bootstrap configuration, which will be located at "iso/boot/grub/". The file grub.cfg will contain the following code,

menuentry "HelloOS" {
    multiboot2 /boot/kernel.sys
    boot
}
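For reference, this is how the iso directory should look once the Makefile has copied the kernel in (you create grub.cfg by hand beforehand; kernel.sys comes from the build):

iso/
    boot/
        grub/
            grub.cfg
        kernel.sys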

Now, from the directory in which the code resides, run the command,

make

This command will compile the sources and create the os.iso file.

In order to test the OS, you need a virtual machine; it can be QEMU, VirtualBox, or VMware Player.

I have tested it in QEMU,

qemu-system-x86_64 os.iso

This will print "Hello, kernel World!" on the screen.


Filed under Operating Systems

DebUtsav ’14

DebUtsav ’14 is a mini version of Debian’s developer conference, organized by the Debian community and Amrita University. Debian conducts DebConf yearly, where developers of the Debian operating system meet to discuss issues around the further development of the system. DebUtsav ’14 is a mini version of DebConf and will be held at Amrita University, Kollam. Debian is a free operating system developed by a group of individuals known as the Debian Project. It is one of the most popular Linux distributions for personal computers and network servers, and it has been used as a base for several other Linux distributions. It is famed for its stability and has one of the widest arrays of software in its archive, serving almost all computer hardware architectures.

DebUtsav is an amazing platform for any student who is passionate about experiencing the magic of open source. It is a great opportunity for developers, contributors, and other interested people to meet the veterans of open source software development, make professional connections, chat with contributors who can guide you, present in front of a grand audience, and comprehend the essence of open source development. The conference will also benefit the development of key components of the Debian system, its infrastructure, and its community. There is a lot of free software that is not yet packaged in Debian, and DebUtsav can be a platform where the speakers act as catalysts to bring more interesting software into Debian, to work together on open issues, or, if not to solve them, at least to guide people on the way forward.

DebUtsav ’14 is accepting proposals for talks, hands-on sessions, and workshops. Registration will be open until the 7th of October, and the topic or theme can be anything related to Free and Open Source Software. This is a grand rostrum for students, speakers, and delegates to talk, share, learn, discuss, debate, and do software development. It is indeed a significant opportunity that will give you a strong foundation to build on! The event will be held on the 17th and 18th of October.

So what are you waiting for? Register to be a speaker at DebUtsav ’14. Click here to register and be proud to be a part of it. :)


Filed under Uncategorized

Building a cross compiler

Building a cross compiler is not a hard task, but if you do not do it the right way it can waste a lot of your time. This blog post is based on the steps I followed on my machine, which runs Ubuntu 14.04.


First, make sure that you keep all the sources in the directory "$HOME/src".

mkdir $HOME/src
cd $HOME/src
wget http://ftp.gnu.org/gnu/binutils/binutils-2.24.tar.gz

tar xvf binutils-2.24.tar.gz
wget http://ftp.gnu.org/gnu/gcc/gcc-4.9.1/gcc-4.9.1.tar.gz
tar xvf gcc-4.9.1.tar.gz

You can change the version numbers to the latest ones; they can be obtained from the GNU websites.

Install a few supporting packages before you start the build process,

sudo apt-get install libmpc-dev
sudo apt-get install libcloog-isl-dev
sudo apt-get install libisl-dev
sudo apt-get install libmpfr-dev
sudo apt-get install libgmp3-dev

These packages should also be available in other Linux distributions.

Now we need to decide where to install the new compiler; it is dangerous to install it in the system directories, so we create the directory $HOME/opt/cross instead. If you want it to be available globally, "/usr/local/cross" is a common location.

Prepare to compile by setting the correct PATH; these lines can either be executed as shell commands or added to the ~/.bashrc file,

export PREFIX="$HOME/opt/cross"
export TARGET=x86_64-elf 
export PATH="$PREFIX/bin:$PATH"

Take the first step by building Binutils,

cd $HOME/src 
mkdir build-binutils 
cd build-binutils 
../binutils-2.24/configure --target=$TARGET --prefix="$PREFIX" --disable-nls --disable-werror 
make -j12 
make -j12 install

Next, build GCC (note the cd back to $HOME/src, since the previous step left us inside build-binutils),

cd $HOME/src
mkdir build-gcc
cd build-gcc
../gcc-4.9.1/configure --target=$TARGET --prefix="$PREFIX" --disable-nls --enable-languages=c,c++ --without-headers
make -j12 all-gcc
make -j12 all-target-libgcc
make install-gcc
make install-target-libgcc

Now you have a freestanding cross compiler, one with no access to a C library or C runtime. You can verify that it is on your PATH by running x86_64-elf-gcc --version.


Filed under linux, Operating Systems

Evolution of Virtual Memory

Before the development of higher-level programming languages, programmers had to implement storage allocation methods in their applications. An application was divided into overlays that were loaded into memory one at a time, since the size of memory was confined. This methodology worked because programmers knew both the machine details and their application. But things started changing with the launch of higher-level programming languages and the growing size of applications, which led to unmanageable overlays. To overcome this problem, Fotheringham in 1961 came up with the solution of "dynamic storage allocation", in which the task of memory allocation is done by the software and hardware. With dynamic allocation, programmers can focus on problem solving rather than spending time understanding hardware details; it also gives them the illusion of a vast amount of memory at their disposal.


Now, let us see how this concept was implemented in the Atlas computer.

The main memory in Atlas is divided into the core store and the drum store. The processor has immediate access to the core store, but it is limited in size, so some data must be kept in the drum store. Swapping data between the core store and the drum store is done by the SUPERVISOR, a housekeeping program that resides in memory. The address space contains one million 48-bit words, and memory is grouped into units of 512 words; note that a 512-word block of information and a 512-word unit of memory are different things. A 512-word unit of the core store is called a 'page', and it can hold one block of information. Associated with each page is an 11-bit page address register that stores the number of the block it currently holds.

The "address" of a word in main memory is 20 bits long and consists of an 11-bit block address and a 9-bit position within the block. The concept of an "address" is distinct from the actual physical location, and this distinction is the core idea of the system. When a block needs to be loaded into memory, it is assigned an address within the 20-bit range, and this address is made known to the SUPERVISOR. While the block is brought into the core store, the SUPERVISOR arranges for that area of core store to have the appropriate addresses; it also keeps track of blocks on the drum with the help of a table. During execution, if the program cannot find a block in the core store, an interrupt is sent to the SUPERVISOR, which loads the required block into memory. To make room for this block, a block in the core store must be written back to the drum; the drum transfer learning program does the job of selecting which block to write back.
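A toy Python model of this address split; the 11-bit/9-bit layout comes from the paper, while the function names and the dictionary standing in for the page address registers are purely illustrative:

PAGE_WORDS = 512  # one page holds one 512-word block

def split_address(addr):
    """Split a 20-bit Atlas address into (block number, word within block)."""
    assert 0 <= addr < 1 << 20
    return addr >> 9, addr & 0x1FF

page_address_registers = {}  # block number -> core page currently holding it

def translate(addr):
    """Return the core-store location of addr, or fault to the SUPERVISOR."""
    block, word = split_address(addr)
    if block not in page_address_registers:
        raise LookupError("page fault: SUPERVISOR must fetch block %d" % block)
    return page_address_registers[block] * PAGE_WORDS + word

page_address_registers[3] = 0          # block 3 currently resides in core page 0
assert translate((3 << 9) | 7) == 7    # word 7 of block 3 maps to core word 7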

"Dynamic Storage Allocation in the Atlas Computer, Including an Automatic Use of a Backing Store" is the first paper on the virtual memory concept in operating systems. Significant later work on virtual memory was done by Peter J. Denning at Princeton University; his paper "Virtual Memory" extends the work done in this one.


Filed under Operating Systems

BitTorrent Ecosystem

I have been hearing about torrents for a long time, and the only thing I knew was that you can download movies with them. But rather than just downloading movies, has anyone tried to understand how torrents work? Has anyone thought about how complex they are? Did you know that BitTorrent generates more than 40% of Internet traffic?

BitTorrent had its beginnings in a university, where a student developed it to share resources among peers; its creators never knew it would make such a huge impact on the Internet. Though it is a major technical achievement, there have been widespread protests against it from the movie industry, seeking to shut down public torrent discovery sites like The Pirate Bay to protect their interests.

Let me dive into the BitTorrent ecosystem. The ecosystem consists of three main components,

a) Tracker: a computer that helps the peers find each other to form a torrent.

b) Torrent discovery sites: the places where people can find and upload torrent files.

c) Peers: the computers that form a torrent.

"The collection of peers that participate in the distribution of a specific file at a given time is called a torrent."

The peers in a torrent can be classified into two groups: seeders, which hold the complete file, and leechers, which hold it only partially. Files are distributed as chunks, and the BitTorrent client keeps track of which chunks have been downloaded to each peer, ensuring that the complete file is downloaded from the peers in the torrent with integrity.
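As a rough illustration of this bookkeeping, here is a small Python sketch; the hash list stands in for the per-chunk hashes carried in the .torrent metadata, and all the names are mine:

import hashlib

class PieceTracker:
    """Toy model of how a client tracks and verifies downloaded chunks."""

    def __init__(self, piece_hashes):
        self.piece_hashes = piece_hashes        # from the .torrent metadata
        self.have = [False] * len(piece_hashes)

    def add_piece(self, index, data):
        """Accept a chunk only if it matches the hash in the metadata."""
        if hashlib.sha1(data).hexdigest() != self.piece_hashes[index]:
            return False                        # corrupt piece: discard it
        self.have[index] = True
        return True

    def is_seeder(self):
        """A peer holding every chunk is a seeder; otherwise it is a leecher."""
        return all(self.have)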

If a file needs to be shared, it is first seeded locally by the peer that starts the torrent. The .torrent file produced by the seeding process can be uploaded to a torrent discovery site. Anyone who needs that particular resource downloads the torrent file and opens it in a locally installed BitTorrent client. The torrent file contains the metadata about the file to be distributed and the network locations of trackers.

For example, peer A creates a torrent to distribute the file movie.avi by first seeding the file locally and registering itself with a tracker; this produces a movie.torrent file, which A uploads to a torrent discovery site. If peer B wants to join this torrent, it downloads the movie.torrent file from the site, and its BitTorrent client, which extracts the details of the torrent from the file, joins the torrent by contacting the tracker.

This peer-to-peer protocol has enabled file distribution with efficient use of bandwidth and resources. But there are various privacy and security concerns in the BitTorrent ecosystem that still need to be addressed.


Filed under General, web

InCTF 2013

Amrita University & Amrita Centre for Cyber Security

Proudly present

InCTF ’13

National Level “Capture The Flag” style hacking contest


Not a day passes without several machines being compromised and infections spreading rampantly in the world today. The cyber world has witnessed several dangerous attacks, including the Stuxnet virus and its successor Duqu. Other recent attacks include the Flame malware, which managed to disguise itself as legitimate Windows software: it exploited a bug in Windows to obtain a certificate that allowed it to authenticate itself as genuine Windows software. Other notable examples include the rise of botnets such as the highly resilient Zeus banking trojan and the Conficker worm. There have also been instances of espionage by government agencies on one another, such as the recent incident where the Georgian CERT discovered a Russian hacker spying on them.

Indian websites offer little or no resistance to such security incidents. The Indian Computer Emergency Response Team (CERT-In) has been tracking defacements of Indian websites among other security incidents; their monthly and annual bulletins detail the various vulnerabilities and malware infections found across Indian websites. It is really sad that, with so much talent and skill available, Indian websites are compromised frequently and little is done to stem this wave of attacks.

InCTF is a Capture the Flag style ethical hacking contest, a strategic war-game designed to mimic real-world security challenges. Software developers in India have little exposure to secure coding practices and the effects of not adopting them, which is one of the main reasons systems are compromised so easily these days; following such simple practices can help prevent many incidents. InCTF '13 runs from December 2012 to April 2013 and is focused exclusively on the student community. You can participate from your own university, no travel is required, and no prior exposure or experience in cyber security is needed.

What do you need to do?

1. Form a team (minimum three and maximum five members from your college)
2. Approach a faculty member and request him/her to mentor your team
3. Register online at http://portal.inctf.in

Great Rewards

  • The winning team receives a cash prize of up to Rs. 20,000/-
  • The first runner-up team receives a cash prize of up to Rs. 15,000/-
  • The second runner-up team receives a cash prize of up to Rs. 10,000/-

See http://inctf.in/prizes for more.
Note

  • Teams are awarded prizes based on their performance
  • Deserving teams are well rewarded; there are exciting prizes to be won

So, what are you waiting for? It’s simple: Register, Learn, Hack!


Filed under Uncategorized

My Google Summer Of Code 2012 proposal

  • Project name: Re-purpose the proposals module
  • Mentor: Oscar Carballal
  • Student(s): Bithin A
  • Country (or Timezone) : Kollam, India (GMT+5:30)
  • E-Mail address:
  • IRC nickname (freenode.org) : bithin
  • GitHub/Gitorious repository : https://github.com/bithin
  • Knowledge Required: Django, e-cidadania internals
  • Objective to accomplish:

In a democratic system it is important to ensure the full participation of citizens. e-cidadania is an open-source e-democracy web platform that strives for full citizen participation through various participative processes, such as debates, assemblies, and budgets, useful for associations, companies, and administrations. The e-cidadania objective emphasizes user-friendly interface development that draws more participation from citizens. The events in a participative process include proposals, debates, voting, etc., and each of these events lives inside a space where the participative process takes place. The purpose of this project is to re-purpose the proposal module so that administrators and citizens can link proposals with other events, increasing the acceptance level of decisions on a proposal while improving usability. The following are the objectives that need to be accomplished for the new proposal module:


1) Decoupling the proposal module from the core

The current proposal module depends on other modules in e-cidadania. Implementing a generic relation on the proposal model will allow it to be coupled to or decoupled from any debate or any other e-cidadania module. This is similar to the comments framework in Django, which allows a comment to be attached to any model; a minimal sketch follows.
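A minimal sketch of what this decoupling could look like, using Django's contenttypes framework (modern import paths; the model and field names are illustrative, not e-cidadania's actual schema):

from django.db import models
from django.contrib.contenttypes.fields import GenericForeignKey
from django.contrib.contenttypes.models import ContentType

class Proposal(models.Model):
    """Sketch of a decoupled proposal; field names are illustrative."""
    title = models.CharField(max_length=200)
    description = models.TextField()
    # The generic relation: a proposal can point at a debate, an assembly,
    # or any other e-cidadania model, with no hard-coded foreign key.
    content_type = models.ForeignKey(ContentType, on_delete=models.CASCADE)
    object_id = models.PositiveIntegerField()
    attached_to = GenericForeignKey("content_type", "object_id")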


2) Dynamic proposals form

This will allow the space administrators to create proposal forms according to the needs of each proposal. Proposals vary from process to process, so this is one of the important features that needs to be integrated into the proposal module. The administrator will be able to select the required form fields from a pool to design the proposal forms, and the different forms created will be listed to the users; a sketch of such a form factory follows.
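A hypothetical sketch of building a form class from an administrator-chosen pool of fields; the field pool and all names are mine:

from django import forms

def make_proposal_form(selected):
    """Create a Form class containing only the fields the admin selected."""
    pool = {
        "title": forms.CharField(max_length=200),
        "description": forms.CharField(widget=forms.Textarea),
        "budget": forms.DecimalField(required=False),
        "location": forms.CharField(required=False),
    }
    fields = {name: pool[name] for name in selected}
    return type("ProposalForm", (forms.Form,), fields)

# Example: a space whose proposals need a title, description, and budget.
BudgetProposalForm = make_proposal_form(["title", "description", "budget"])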


3) Grouping proposals

Creating proposal circles will allow the space administrators to group similar proposals, giving better manageability of the proposals submitted by citizens; see the sketch below.
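Continuing the hypothetical models above, a proposal circle could be as simple as a named many-to-many grouping:

from django.db import models

class ProposalCircle(models.Model):
    """Sketch: a named group of similar proposals (names illustrative)."""
    name = models.CharField(max_length=100)
    proposals = models.ManyToManyField("Proposal", related_name="circles")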


4) Geo-location feature

Integrating OpenStreetMap into the proposal module: the location specified in a proposal will be displayed on a map, and the place will be highlighted when it is clicked. This will help the administrators better understand the place to which the proposal is related.

  • Will you be full time on Google Summer of Code?

Yes, I will be working full time on Google Summer of Code.

  • Do you have any other obligations that prevent you from working on GSoC?

I will be participating in a contest towards the end of May: I leave for Russia on May 27 and return on June 5. I also have academic commitments until May 9.

  • How many hours per week will you work on GSoC?

40 hours

  • Work timeline during the GSoC

Until May 1st

Understand in depth the e-cidadania internal architecture.

May 1st – 10th

The community bonding period. Hang around the e-cidadania code base and IRC. Discuss the weekly code review process with Oscar Carballal Prego.

May 10 – 20th

Start working on the design of the proposal module to decouple it from the core, and finalize it through reviews and discussions with Oscar Carballal.

May 21st – 31st

Work on the design of the dynamic forms. Study OpenStreetMap and learn how it can be integrated into a Django application.

June 1st – 15th

Implement the newly designed proposal module and test its compatibility with the other modules. Document the new design and its implementation.

June 15th – 30th

Implement the dynamic proposal form. Integrate the options for creating proposal forms into the admin dashboard. Get the existing code and documentation ready for the mid-term review.

July 1st – 10th

Implement the proposal circle feature to group similar proposals.

July 11th – 30th

Integrate OpenStreetMap into the proposal form.

August

Test the modules for bugs and fix them. Clean up the code. Complete the pending documentation and get the code ready for integration into the repository.
  • About Me


I am a final-year student pursuing a masters in computer science at Amrita University, Amritapuri campus, India. Ever since I heard about the Internet it has fascinated me, and I started learning the various technologies that drive it. Over the years, in my effort to learn more about the Internet, I have mastered many web technologies and networking skills. To put theory into practice, I wish to use these skills to contribute to open source projects; e-cidadania, a web application developed with Python/Django that helps society, was an obvious choice. I have good experience developing and deploying Django applications: I was involved in developing the contest portal in Django for InCTF, a national-level Capture The Flag ethical hacking contest.


My past open source contributions include e-cidadania, Python, and phpSec (a PHP security library). I am an active developer of e-cidadania, where I have reported and fixed many bugs; this has given me sound knowledge of the e-cidadania code base. I am also a core member of foss@amrita and mentor junior students in contributing to open source. Participating in various national and international hacking contests has given me good problem-solving skills and the ability to quickly adapt to the code bases of applications built in different programming languages.


I think e-cidadania is a project that will greatly benefit society, and I am very happy to be part of the community. I would like to contribute to e-cidadania even after Summer of Code, and I take responsibility for maintaining any code contributed by me.


Filed under College, General, web

Socket programming using PHP

This post is for people who have taken a course in computer networks (or a related course), or who are simply interested in networking and love to program in PHP.

Most of you will have learned socket programming in C. It is true that some low-level work can only be done in C, but roughly 95% of the usual tasks can also be done in PHP. PHP is usually used for web application development, at which it does a pretty good job. Here I have written a server application in PHP: a client can telnet to the server and send it a message, and that message is echoed back to the client. Have a look at the code,

/******************************

Coded by: Bithin A

Email id: a.bithin@yahoo.com

Date : 26-02-2012

******************************/

<?php

// The server will wait indefinitely for requests
set_time_limit(0);

// Create a new TCP socket
if (($socks = socket_create(AF_INET, SOCK_STREAM, SOL_TCP)) === false) {
    print "Couldn't create a socket"; exit(1);
}

// Bind it to an address and port on the server
if (socket_bind($socks, '127.0.0.1', 5000) === false) {
    print "The socket could not bind to the address"; exit(1);
}

// Listen for incoming connections
if (socket_listen($socks, 5) === false) {
    print "Not able to listen for clients"; exit(1);
}

// Accept a connection from a client
if (($socks_connection = socket_accept($socks)) === false) {
    print "Not able to accept any connection at the moment"; exit(1);
}

$msg = "Welcome to my server\n";

if (socket_write($socks_connection, $msg, strlen($msg)) === false) {
    print "Could not write to socket"; exit(1);
}

// Read the client's message
if (($buffer = socket_read($socks_connection, 2048)) === false) {
    print "Couldn't read from the socket"; exit(1);
}

// Echo the message back to the client
if (socket_write($socks_connection, "You said : $buffer\n", strlen("You said : $buffer\n")) === false) {
    print "Couldn't write to the socket"; exit(1);
}

socket_close($socks_connection);
socket_close($socks);

?>
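To try it out, save the code as server.php, run it with the PHP CLI (php server.php), and connect from another terminal using telnet 127.0.0.1 5000; whatever you type is echoed back to you.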

Before you run the script with the PHP CLI, make sure the sockets extension for PHP is enabled. Most distribution builds include it; you can check with,

php -m | grep sockets


Filed under web