scaling to multiple data files

That's correct, those users have been logged in or had processes running 
on this machine for four days. The machines in question are time-sharing 
Linux servers for college students and professors to use. multi-day jobs 
are common.

The "last" command does what you suggest, but it doesn't capture 
processes left running in the background when a user logs out.  This 
data is simpler and collected across Windows and Linux hosts. Sessions 
are somewhat ambiguous. We just care who is running processes on a 
machine at a certain time. We also collect the process name of processes 
that each user is running so that we can gauge how often applications 
are being used, and by whom. For this analysis, I'm not worried about 
which processes are running, only unique users per day. i have almost 
four years of historical data for some machines in this format.

We have multiple tools written in different languages that parse this 
data. I'm writing one that does better graphing.

scaling to multiple data files

Thread (3 messages)