Understanding /proc | fredrb

Last week I created a small ps clone in Ruby. This was done purely out of curiosity: I wondered how ps works and how it knows all about the currently running processes. You can find the project here.

Cool things I learned in the process:

Exploring procfs

At first, I went around the web and read about ps and Linux processes. That is how I was first introduced to procfs, through the TLDP documentation (http://www.tldp.org/LDP/Linux-Filesystem-Hierarchy/html/proc.html) (awesome documentation btw).

In summary, all Linux processes can be found in the /proc folder. This folder is of type procfs, which is a virtual filesystem: most of its entries point to in-memory data rather than to files on disk. This is why, if you run ls -l /proc, you'll notice that most files and folders have size 0.

ls -l /proc
dr-xr-xr-x.  9 root           root                         0 Sep 25 22:10 1
dr-xr-xr-x.  9 root           root                         0 Oct  1 10:38 10
dr-xr-xr-x.  9 root           root                         0 Oct  1 12:46 101
dr-xr-xr-x.  9 root           root                         0 Oct  1 12:46 102
...
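The size-0 effect can also be observed from Ruby. The sketch below assumes a Linux system with procfs mounted at /proc; the helper name size_and_bytes is made up for this illustration. It compares the size reported by the filesystem with the number of bytes that can actually be read.

```ruby
# Hypothetical helper: returns [size reported by stat, bytes actually read].
def size_and_bytes(path)
  [File.size(path), File.read(path).bytesize]
end

# Only meaningful where procfs is mounted (assumption: Linux).
if File.exist?("/proc/self/status")
  reported, read = size_and_bytes("/proc/self/status")
  puts "reported size: #{reported}, bytes read: #{read}"
end
```

For a regular file on disk both numbers match. For a procfs entry the reported size is typically 0, because the content is generated when the file is read.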

Inside /proc there is one folder for each running process, named after its pid. So I opened one of the folders to see what I could learn about a running process just by reading these files.

ls -l /proc/<pid>
total 0
dr-xr-xr-x. 2 fredrb fredrb 0 Sep 28 23:15 attr
-rw-r--r--. 1 root   root   0 Oct  1 10:46 autogroup
-r--------. 1 root   root   0 Oct  1 10:46 auxv
-r--r--r--. 1 root   root   0 Sep 28 23:15 cgroup
--w-------. 1 root   root   0 Oct  1 10:46 clear_refs
-r--r--r--. 1 root   root   0 Sep 28 22:41 cmdline
-rw-r--r--. 1 root   root   0 Oct  1 10:46 comm
-rw-r--r--. 1 root   root   0 Oct  1 10:46 coredump_filter
...

Ok, now I have a bunch of files like autogroup, gid_map and maps, and I have no idea what they're for. A good starting point would be checking their documentation. But why on earth shouldn't I just open them?

So I started looping through the files one by one. Most of them were completely unreadable to me, until I ran into the pot of gold:

cat /proc/<pid>/status
Name:	chrome
State:	S (sleeping)
Tgid:	3054
Ngid:	0
Pid:	3054
PPid:	2934
TracerPid:	0
Uid:	1000	1000	1000	1000
Gid:	1000	1000	1000	1000
FDSize:	64
Groups:	10 1000 1001
VmPeak:	 1305996 kB
VmSize:	 1232520 kB
...

This is great! Finally something human readable. It contains general data about the process, like its state, memory usage and owner. But is this all I need?

Not satisfied with exploring /proc by hand, I decided to run ps under strace to see if it's accessing any of the files I found.

strace -o ./strace_log ps aux

strace logs every system call a program makes. So I filtered the strace output by the open system call and, as I suspected (maybe I didn't), the files being opened by ps were the same ones I had checked first:

cat ./strace_log | grep open
[...]
open("/proc/1/stat", O_RDONLY) = 6
open("/proc/1/status", O_RDONLY) = 6
[...]
open("/proc/2/stat", O_RDONLY) = 6
open("/proc/2/status", O_RDONLY) = 6
open("/proc/2/cmdline", O_RDONLY) = 6
open("/proc/3/stat", O_RDONLY) = 6
open("/proc/3/status", O_RDONLY) = 6
open("/proc/3/cmdline", O_RDONLY) = 6
[...]
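The same filtering can be sketched in Ruby. This is only an illustration under a few assumptions: each call sits on a single line as in the log above, the helper name opened_paths is made up, and the pattern also accepts openat, since on newer systems the log may show openat(AT_FDCWD, "...") instead of open("...").

```ruby
# Hypothetical helper: extract the path argument of every open/openat line.
def opened_paths(log_text)
  log_text.each_line.map { |line|
    match = line.match(/\Aopen(?:at)?\(.*?"([^"]+)"/)
    match && match[1]
  }.compact
end

sample_log = <<~LOG
  open("/proc/1/stat", O_RDONLY) = 6
  openat(AT_FDCWD, "/proc/1/status", O_RDONLY) = 6
  close(6) = 0
LOG

# Keep only per-process files and drop duplicates.
puts opened_paths(sample_log).grep(%r{\A/proc/\d+/}).uniq
```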

Ok, so we have the stat, status and cmdline files to check. Now all we need to do is parse them and extract what we need.
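The rest of this post only parses status, so here is a rough sketch of how the other two files could be read. It relies on two facts about their layout: cmdline separates arguments with NUL bytes, and stat begins with "pid (comm) state ppid ...". The helper names parse_cmdline and parse_stat_head are made up for this example and are not part of the project.

```ruby
# /proc/<pid>/cmdline: arguments are separated by NUL bytes.
def parse_cmdline(raw)
  raw.split("\0")
end

# /proc/<pid>/stat: the command name sits in parentheses and may itself
# contain spaces or ")", so we split around the LAST closing parenthesis.
def parse_stat_head(raw)
  open_paren = raw.index("(")
  close_paren = raw.rindex(")")
  rest = raw[(close_paren + 1)..-1].split
  {
    pid: raw[0...open_paren].strip.to_i,
    comm: raw[(open_paren + 1)...close_paren],
    state: rest[0],
    ppid: rest[1].to_i
  }
end

# Only meaningful where procfs is mounted (assumption: Linux).
if File.exist?("/proc/self/stat")
  puts parse_stat_head(File.read("/proc/self/stat")).inspect
  puts parse_cmdline(File.read("/proc/self/cmdline")).inspect
end
```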

The code

The implementation turned out to be fairly simple: it comes down to reading files and displaying their contents in an organized manner.

Process data structure

We want to display our data in a tabular way, where each process is a record in the table. Let's take the following class as one of our table records:

class ProcessData
  attr_reader :pid
  attr_reader :name
  attr_reader :user
  attr_reader :state
  attr_reader :rss

  def initialize pid, name, user, state, rss
    @pid = pid
    # parse_name is not shown in this post
    @name = parse_name name
    @user = user
    @state = state
    @rss = rss
  end
end
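As an aside, a record like this could also be sketched with Ruby's built-in Struct, which generates the readers and the constructor. ProcessRecord is a hypothetical name, picked so it does not clash with the ProcessData class above, and this variant skips the parse_name step.

```ruby
# Hypothetical alternative to ProcessData: one Struct member per column.
ProcessRecord = Struct.new(:pid, :name, :user, :state, :rss)

record = ProcessRecord.new("1", "systemd", "root", "S (sleeping)", "8444 kB")
puts record.name
puts record.to_a.inspect
```

A Struct also brings to_a for free, which is handy when every member ends up as one column of a table.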

Finding Pids for running processes

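Take into account what we know so far: /proc contains one folder per running process, and each of those folders is named after the process's pid.

So gathering a list of all current pids should be easy: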

def get_current_pids
  pids = []
  Dir.foreach("/proc") { |d|
    if is_process_folder?(d)
      pids.push(d)
    end
  }
  return pids
end

In order to be a valid process folder, an entry must fulfill two requirements: it must be a directory, and its name must be a number (the pid):

def is_process_folder? folder
  File.directory?("/proc/#{folder}") and (folder.to_i != 0)
end
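One detail about the folder.to_i != 0 check above: String#to_i parses only the leading digits, so a name that merely starts with digits would also pass. That should not matter for the entries normally found in /proc, but a stricter variant could be sketched as below. The helper name pid_name? is made up, and it only checks the name, so it would still be combined with File.directory?.

```ruby
# Hypothetical stricter check: the whole name must consist of digits.
def pid_name?(name)
  name.match?(/\A\d+\z/)
end

entries = ["1", "10", "self", "cpuinfo", "12abc"]
puts entries.select { |entry| pid_name?(entry) }.inspect

# For comparison, to_i accepts a name that only starts with digits.
puts "12abc".to_i
```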

Extracting process data

Now that we know every pid in the system we should create a method that exposes data from /proc/<pid>/status for any of them.

But first, let's analyze the file.

cat /proc/<pid>/status
Name:	chrome
State:	S (sleeping)
...
Uid:	1000	1000	1000	1000

This file is organized in the following way: Key:\t[values]. This means that every piece of data in this file can be extracted by following the same pattern. However, some lines have a single value and others have a list of values (like Uid).

def get_process_data pid
  proc_data = {}
  File.open("/proc/#{pid}/status") { |file|
    begin
      while line = file.readline
        data = line.strip.split("\t")
        key = data.delete_at(0).downcase
        proc_data[key] = data
      end
      file.close
    rescue EOFError
      file.close
    end
  }
  return proc_data
end

The method above results in the following structure:

get_process_data 2917
=> {"name:"=>["chrome"],
 "state:"=>["S (sleeping)"],
 "tgid:"=>["2917"],
 "ngid:"=>["0"],
 "pid:"=>["2917"],
 "ppid:"=>["1"],
 "tracerpid:"=>["0"],
 "uid:"=>["1000", "1000", "1000", "1000"],
 ...
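Note that the keys in the structure above keep their trailing colon ("name:", "uid:"). A possible variant, sketched here as a pure function over the file's text so it can be tried without touching /proc, drops the colon and strips the padding around each value. The name parse_status_text is hypothetical and not part of the project.

```ruby
# Hypothetical variant: parse "Key:\tvalue\tvalue" lines into a Hash
# with lower-case keys (without the colon) and arrays of stripped values.
def parse_status_text(text)
  text.each_line.each_with_object({}) do |line, result|
    key, *values = line.chomp.split("\t")
    next if key.nil? || key.empty?
    result[key.chomp(":").downcase] = values.map(&:strip)
  end
end

sample_status = "Name:\tchrome\nState:\tS (sleeping)\nUid:\t1000\t1000\t1000\t1000\n"
puts parse_status_text(sample_status).inspect
```

With real data it could be fed from File.read("/proc/#{pid}/status"), assuming the process still exists at the time of the read.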

Reading user data

The association between a user's uid and name is kept in the /etc/passwd file, so in order to show the correct username we must also read and parse this file.

For the sake of simplicity, let's just read the whole file and save it in a Hash with the uid as key and the name as value.

def get_users
  users = {}
  File.open("/etc/passwd", "r") { |file|
    begin
      while line = file.readline
        data = line.strip.split(":")
        users[data[2]] = data[0]
      end
      file.close
    rescue EOFError
      file.close
    end
  }
  return users
end
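Ruby's standard library also ships the Etc module, which can look up a user by uid without parsing the file by hand. The sketch below is an alternative to get_users, not what the project does; username_for is a made-up name, and falling back to the raw uid when no user is found is a choice made for this example.

```ruby
require "etc"

# Hypothetical helper: resolve a uid (String or Integer) to a user name.
# Etc.getpwuid raises ArgumentError when no such user exists, and so does
# Integer() on non-numeric input; in both cases we return the input as text.
def username_for(uid)
  Etc.getpwuid(Integer(uid)).name
rescue ArgumentError
  uid.to_s
end

puts username_for("0")
```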

Creating process records

So far we have found the pids in the system, read the status file and extracted the data. What we have to do now is to filter and organize this data into a single record that will be presented to the user.

def create_process pid
  data = get_process_data pid

  name = data["name:"][0]
  user_id = data["uid:"][0]
  state = data["state:"][0]

  # Not every process has a VmRSS line; rss stays nil when it is missing
  if data["vmrss:"] != nil
    rss = data["vmrss:"][0]
  end

  user = get_users[user_id]

  return ProcessData.new(pid, name, user, state, rss)
end

# create_process must be defined before this loop runs
current_processes = get_current_pids
current_processes.each { |p|
  process = create_process p
  puts "#{process.name}\t#{process.user}\t#{process.state}\t#{process.rss}"
}

We read the VmRSS value because we want the resident set size: only what is held in physical memory, not what has been swapped out to disk. Kernel threads have no VmRSS line in their status file, which is why the code checks for nil.
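The VmRSS value arrives as a padded string such as "    8444 kB". If a number is needed, for sorting or summing for instance, it could be converted with a small helper like the one sketched below. rss_in_kb is a hypothetical name and is not part of the project.

```ruby
# Hypothetical helper: turn a VmRSS value like "    8444 kB" into an Integer.
# Returns nil when the value is missing or does not look like "<digits> kB".
def rss_in_kb(value)
  return nil if value.nil?
  match = value.match(/(\d+)\s*kB/)
  match && match[1].to_i
end

puts rss_in_kb("    8444 kB").inspect
puts rss_in_kb(nil).inspect
```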

Extra (formatting)

You can format ProcessData text in a tabular way to get a prettier output.

format="%6s\t%-15s\t%-10s\t%-10s\t%-10s\n"
printf(format, "PID", "NAME", "USER", "STATE", "MEMORY")
printf(format, "------", "---------------", "----------", "----------", "----------")
current_processes.each { |p|
  process = create_process p
  printf(format, process.pid, process.name, process.user, process.state, process.rss)
}

Result:

   PID	NAME           	USER      	STATE     	MEMORY
------	---------------	----------	----------	----------
     1	systemd        	root      	S (sleeping)	    8444 kB
     2	kthreadd       	root      	S (sleeping)	
     3	ksoftirqd/0    	root      	S (sleeping)	
...
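In the result above, "S (sleeping)" is wider than the 10 characters reserved for the STATE column, so that column spills over. One way to keep every column at a fixed width is to add a precision to each %s, which truncates longer strings. This is only a sketch: ROW_FORMAT and format_row are names made up for this example.

```ruby
# "%-10.10s" pads to 10 characters AND truncates anything longer.
ROW_FORMAT = "%6s\t%-15.15s\t%-10.10s\t%-10.10s\t%-10.10s\n"

# Hypothetical helper: format one table row with fixed-width columns.
def format_row(*columns)
  format(ROW_FORMAT, *columns)
end

print format_row("PID", "NAME", "USER", "STATE", "MEMORY")
print format_row(1, "systemd", "root", "S (sleeping)", "8444 kB")
print format_row(2, "kthreadd", "root", "S (sleeping)", nil)
```

Truncation loses information, so another option would be to print only the one-letter state code.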

Conclusion

There is a lot of information to be found under the /proc folder. This post only covers basic data like name, state and resident memory. But if you dig deep into those files you will find a lot more, like memory mappings and CPU usage.

It was very interesting exploring this part of Linux and hopefully you learned something new with this.
