[vox-tech] Backup User Permissions

Alex Mandel tech_dev at wildintellect.com
Tue Jan 25 16:45:59 PST 2011


On 01/24/2011 02:43 PM, Kyle Ambroff wrote:
> On Mon, Jan 24, 2011 at 1:27 PM, Alex Mandel <tech_dev at wildintellect.com> wrote:
>> So I'm trying to set up automated remote backup of some files from
>> machine1 to machine2 using something simple like rsync. What I'm having
>> trouble figuring out is what user to run it as and how to get that user
>> the correct permissions.
>>
>> In the example use case I want to copy my Apache logs over to a 2nd
>> machine to run awstats on it without putting much of a load on the
>> actual web server. I was thinking of creating a "backup" user,
>> generating a passphraseless key and then running rsync from cron.
>> Should this user be a system user (UID below 1000) or a regular user
>> (UID 1000 or above)? Since it needs a key, I would assume it needs to
>> be a regular user with a home directory.
>>
>> Question 2 is how do I make sure it has permission to read the logs?
>> It appears that most of the files in /var/log/apache2 are root:adm but
>> some are root:root. If they were all g+r for adm, would just adding my
>> backup user to the adm group work?
>>
>> Looks like I need to go figure out why some logs have a different group.
> 
> I really don't think using SSH for stuff like this is a good idea.
> It's just too hard to get the security right, especially with a
> passphraseless key. Too scary for me. Just don't do it.
> 
> If your main use case is collecting statistics from your web servers
> then I suggest you look at Ganglia[1]. One of my coworkers has
> released a bunch of really awesome Ganglia tools that we use at Linden
> Lab for monitoring 10k+ servers, many of which are running Apache.
> 
> http://ben.hartshorne.net/ganglia/
> 
> Check out ganglia-logtailer for example. It includes support for
> collecting the following stats from Apache:
> 
>  * Requests per second
>  * Requests per second broken down by HTTP method
>  * Average query processing time
>  * Ninetieth percentile query processing time
>  * Number of 200, 300, 400 and 500 responses per second.
> 
> All of this data ends up on your Ganglia dashboard, along with general
> system health. As an added bonus you can use his ganglios plugin for
> Nagios[2] to set up alerts on any value in Ganglia. This is just
> fantastic once you have it set up. You can set it up to send SMS
> messages or emails if you have a spike in 500 responses, for example.
> Having historical performance data can be a life saver as well.
> 
> -Kyle
> 
> [1] http://ganglia.sourceforge.net/
> [2] http://www.nagios.org/

Interesting idea, but that isn't really the data I'm trying to get. I
already have munin running for general health tracking. The analysis of
the logs is more about who is visiting what and where they are from,
plus inferring how long they stayed on the site. So it really relies on
having the complete log files and running them through some tools: one
is awstats, another is an import into an RDBMS for more exact query
reporting.
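
To make "inferring how long they stayed" concrete, here is a rough
Python sketch of the kind of thing I mean. It is not how awstats does
it, just an illustration of why whole log files are needed. It assumes
the common/combined Apache log format, treats each client IP as one
visitor, and closes a visit after 30 minutes of inactivity. The
30-minute gap and the names parse_line and visits are my own
placeholders, not anything from awstats or Apache.

```python
import re
from collections import defaultdict
from datetime import datetime, timedelta

# Prefix of the common/combined log format:
# host ident user [time] "METHOD path protocol" status size
LOG_RE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+'
)
TIME_FMT = "%d/%b/%Y:%H:%M:%S %z"

# Assumption: a pause longer than this starts a new visit.
SESSION_GAP = timedelta(minutes=30)


def parse_line(line):
    """Return (host, timestamp, path), or None if the line does not match."""
    m = LOG_RE.match(line)
    if m is None:
        return None
    when = datetime.strptime(m.group("time"), TIME_FMT)
    return m.group("host"), when, m.group("path")


def visits(lines):
    """Group hits by client IP and split them into visits.

    Returns a list of (host, start_time, duration, pages) tuples.
    Duration is the time between the first and last hit of a visit,
    so a single-hit visit has a duration of zero.
    """
    hits = defaultdict(list)
    for line in lines:
        parsed = parse_line(line)
        if parsed:
            host, when, path = parsed
            hits[host].append((when, path))

    result = []
    for host, entries in hits.items():
        entries.sort()
        start = prev = entries[0][0]
        pages = [entries[0][1]]
        for when, path in entries[1:]:
            if when - prev > SESSION_GAP:
                # Gap too long: close the current visit, start a new one.
                result.append((host, start, prev - start, pages))
                start, pages = when, []
            pages.append(path)
            prev = when
        result.append((host, start, prev - start, pages))
    return result
```

Something like visits(open("access.log")) would then give one row per
visit, which is also roughly the shape I would load into a database
table. The file name here is only an example.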

Thanks,
Alex

