Showing posts with label cron. Show all posts

Monday, August 9, 2010

Transferring Files With FTP


Abstract:
FTP, with its mini macro programming language, has been around since the beginning of internet time. For decades it has been used to move files around the internet without resorting to lower-level sockets programming. When transferring files, there is sometimes a question on the receiving "ftpd" end over whether a file has completed its transfer. This problem can be mitigated with several best practices, so the receiving end knows when a transferred file is ready for batch processing through a scheduling facility such as "cron".

Option 1 - Lock Files:
One could ask the initiator to create a lock file (send-me.cpio.gz.lock), send the data file (send-me.cpio.gz), and then remove the lock file once the transfer completes. The cron job on the receiving end picks up a file only when it sees no corresponding lock file.

This is helpful for transferring a single file as well as multiple files when that single file was "split" (send-me.cpio.gz.1, send-me.cpio.gz.2, send-me.cpio.gz.3, etc.). Processing will not commence until all the files in the batch have been sent and the lock file has been removed.
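
The receiving side of the lock file approach can be sketched as a small shell function for the cron job to call. This is only an illustration under assumptions: the function name is made up, and a split part such as send-me.cpio.gz.1 is assumed to be guarded by send-me.cpio.gz.lock, so the name with its last suffix removed is checked too.

```shell
# List files in a drop directory that are safe to process: regular files
# that are not lock files and have no matching lock file beside them.
# (list_unlocked is a hypothetical name, not part of any ftp tool.)
list_unlocked() {
    dir=$1
    for path in "$dir"/*; do
        [ -f "$path" ] || continue                  # skip directories and an empty glob
        name=${path##*/}
        case $name in
            *.lock) continue ;;                     # the lock file itself is never data
        esac
        [ -e "$dir/$name.lock" ] && continue        # lock for this exact file
        [ -e "$dir/${name%.*}.lock" ] && continue   # lock covering a split batch
        printf '%s\n' "$name"
    done
}
```

A cron script could then loop over the output of list_unlocked for its drop directory and process each name it prints.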

Option 2 - Suffixes:
A second option is for the sender to upload the file (send-me.cpio.gz) under a separate suffix that marks it as in transit (put send-me.cpio.gz send-me.cpio.gz.work) and, once the file has been sent, rename it within the same ftp session (rename send-me.cpio.gz.work send-me.cpio.gz). The rename is an atomic operation, so cron on the receiving platform can pick up files that do not have a ".work" suffix (or only pick up files which have a ".gz" suffix!).

This option is often very helpful for the occasional transfer of a single large file, where the integrity of the file is important, but people don't want to add too much complexity.
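
On the sending side, the upload under a ".work" name followed by a rename can be scripted. The sketch below only prints the ftp commands; the function name is hypothetical, and it assumes the login is handled elsewhere (for example through the sender's ".netrc").

```shell
# Print the ftp commands that upload a file under a ".work" name and
# rename it to its final name only after the upload has finished.
# (ftp_put_then_rename is a hypothetical helper name.)
ftp_put_then_rename() {
    file=$1
    printf 'binary\n'
    printf 'put %s %s.work\n' "$file" "$file"
    printf 'rename %s.work %s\n' "$file" "$file"
    printf 'bye\n'
}
```

The output is meant to be piped into the ftp client, for example "ftp_put_then_rename send-me.cpio.gz | ftp ftp.example.com", where the host name is a placeholder.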

Option 3 - Work Directories:
A third option, if one does not want to rename the files, is to have the initiator place the files in a temporary directory (/temp) and then move each file to the production directory (/prod) within the same ftp session. The cron job picks up files only from the production directory, where every file is known to be completely transferred because the move is an atomic operation (as long as both directories are on the same file system).

If large numbers of small files need to be transferred, this process is very helpful, since the directory "inode" may occasionally grow aggressively (slowing down all processing) in the temporary or production directory, requiring an occasional rebuild of the emptied directory (rmdir /temp; mkdir /temp) to resize it.
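
With work directories, the cron job only ever looks in the production directory. A minimal sketch of such a job follows; handle_file is a stand-in for the real batch step, and the archive directory is an assumption added so that a file is not processed twice.

```shell
# Placeholder for the real batch step (hypothetical).
handle_file() {
    printf 'processing %s\n' "${1##*/}"
}

# Hand every regular file in the production directory to the handler, then
# move it to an archive directory so the next cron run skips it.
process_prod() {
    prod=$1
    archive=$2
    for path in "$prod"/*; do
        [ -f "$path" ] || continue        # skip subdirectories and an empty glob
        # Archive only when the handler succeeded, so failures are retried.
        handle_file "$path" && mv "$path" "$archive"/
    done
}
```

Files still sitting in the temporary directory are never touched, which is the point of the scheme.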

Option 4 - Multiple Files:
A fourth option deals well with transferring many files (mput) from an initiating system where the receiver wants to process them as they arrive. If there is a directory holding a large number of files (file1.Z, file2.Z, file3.Z, file4.Z, ...), the initiator can create an additional marker file with a known suffix for each one (file1.Z.CoMpLeTe, file2.Z.CoMpLeTe, file3.Z.CoMpLeTe, file4.Z.CoMpLeTe, ...) and then initiate the "mput". The receiver can have "cron" jobs set up to look for the suffix ("CoMpLeTe"), process the original file name, and upon processing completion, purge the file containing the suffix.

This is especially helpful where transfers may be overlapping from multiple sources with multiple files and the receiving end wants to process the individual files in as close to real-time as possible.
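
A receiving cron job for the marker file scheme might look like the sketch below. It assumes each marker arrives after its data file (as it would when "mput" sends names in sorted order), and the "processing" line is only a stand-in for the real work.

```shell
# For each marker file, process the data file it names and then purge the
# marker. (process_completed is a hypothetical name.)
process_completed() {
    dir=$1
    for marker in "$dir"/*.CoMpLeTe; do
        [ -f "$marker" ] || continue              # no markers: the glob stays unexpanded
        data=${marker%.CoMpLeTe}                  # file1.Z.CoMpLeTe -> file1.Z
        [ -f "$data" ] || continue                # marker without its data file: wait
        printf 'processing %s\n' "${data##*/}"    # stand-in for the real batch step
        rm -f "$marker"                           # purge the marker when done
    done
}
```

Data files without a marker are left alone until a later run finds their marker.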

Advanced Automation:
If the senders are newbies to the internet and have worked very little with FTP on the initiating or sending end, there are ways to help them along.

With "ftp", you can build macros on the sending end so the process of logging in, renaming, moving files, creating/removing lock files, or logging out can be reduced to single macro commands, to further remove complexity on the sending end.

The receiver can build the macros and just send them to the people who are the file senders, and the receiver can maintain the ftp macro code, as well. The "ftp" protocol can be used to update those foreign macro files, using a "rename" to swap out the old macro file and an additional "rename" to swap those new macro files into production.
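
As an illustration of such a macro, the function below prints a ".netrc" fragment that the receiver could hand to a sender. Everything in it is a placeholder (host name, login, password, and the macro name "sendfile"); the "macdef" block ends at the blank line, and "$1" stands for the first argument given when the macro is invoked.

```shell
# Print a sample .netrc entry with a macro that uploads under a ".work"
# name and renames on completion. All values are hypothetical placeholders.
print_netrc_macro() {
    cat <<'EOF'
machine ftp.example.com login sender password CHANGE-ME
macdef sendfile
binary
put $1 $1.work
rename $1.work $1

EOF
}
```

Once the fragment is in the sender's ".netrc" (which should be readable only by its owner, since it holds a password), the transfer inside an ftp session reduces to "$ sendfile send-me.cpio.gz".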

Conclusion:
When there is a need to send files regularly from a source to a destination, the FTP protocol is a good choice when the sender cooperates with the receiver.

Sunday, December 6, 2009

Solaris 10: Measuring Performance Historically

Abstract:
Computing systems have traditionally provided ways to measure the health of the system. UNIX System V systems have depended upon the "System Activity Reporting" or "sar" tool. The "sar" tools can be set up for automatic collection.

Reporting in Real Time:
The "sar" command can be used, without scheduling, to pull data in near-real-time from the kernel by specifying an interval and a sample count. One can poll the run queue statistics 5 times on 2 second intervals using "sar" with the "-q" option:
Ultra60/root# sar -q 2 5
SunOS Ultra60 5.10 Generic_141444-09 sun4u 12/06/2009
22:36:04 runq-sz %runocc swpq-sz %swpocc
22:36:06     0.0       0     0.0       0
22:36:08     1.0      50     0.0       0
22:36:10     0.0       0     0.0       0
22:36:12     0.0       0     0.0       0
22:36:14     0.0       0     0.0       0
Average      1.0      10     0.0       0

Scheduling:
Scheduling in Solaris is done using the "crontab" facility. The "cron" daemon wakes up on a regular basis and runs scheduled tasks for individual users. The running scheduler appears in the process table.
Ultra60/root# ps -elf | grep cron
0 S root 307 1 0 40 20 ? 693 ? 13:44:11 ? 0:00 /usr/sbin/cron
The task lists scheduled by users can be browsed.
Ultra2/root$ cd /var/spool/cron/crontabs
Ultra2/root$ ls -al *
-rw------- 1 root sys  190 Sep 3 14:22 adm
-r-------- 1 root root 452 Sep 3 14:22 lp
-rw------- 1 root root 531 Dec 6 01:13 root
-rw------- 1 root sys  308 Sep 3 14:22 sys
-r-------- 1 root sys  404 Dec 5 06:26 uucp

Scheduling System Activity Reporting:
The "sar" collection is typically scheduled under the "sys" user. By default it does not run; the sample entries are shipped commented out.
Ultra2/root$ cd /var/spool/cron/crontabs
Ultra2/root$ cat sys
#ident "@(#)sys 1.5 92/07/14 SMI" /* SVr4.0 1.2 */
#
# The sys crontab should be used to do performance collection. See cron
# and performance manual pages for details on startup.
#
# 0 * * * 0-6 /usr/lib/sa/sa1
# 20,40 8-17 * * 1-5 /usr/lib/sa/sa1
# 5 18 * * 1-5 /usr/lib/sa/sa2 -s 8:00 -e 18:01 -i 1200 -A
The following "sys" "crontab" entry will schedule 15 minute collections of performance metrics.
00,15,30,45 * * * * /usr/lib/sa/sa1

Viewing Scheduling by User:
The correct way to list scheduling information by user is to use "crontab" with the "-l" option.
Ultra60/root# crontab -l sys
#ident "@(#)sys 1.5 92/07/14 SMI" /* SVr4.0 1.2 */
#
# The sys crontab should be used to do performance collection. See cron
# and performance manual pages for details on startup.
#
# 0 * * * 0-6 /usr/lib/sa/sa1
# 20,40 8-17 * * 1-5 /usr/lib/sa/sa1
# 5 18 * * 1-5 /usr/lib/sa/sa2 -s 8:00 -e 18:01 -i 1200 -A
#
00,15,30,45 * * * * /usr/lib/sa/sa1


Historic Data:
The historic data is held in the "/var/adm/sa" directory. Files are named by the numeric day of the month, so a total of one month of data is retained.
Ultra60/root# cd /var/adm/sa
Ultra60/root# ls -al
total 67068
drwxrwxr-x 2 adm  sys      512 Dec  6 00:00 .
drwxrwxr-x 9 root sys      512 Dec  5 03:10 ..
-rw-r--r-- 1 sys  sys  1290144 Dec  1 23:45 sa01
-rw-r--r-- 1 sys  sys  1177344 Dec  2 23:45 sa02
-rw-r--r-- 1 sys  sys  1177344 Dec  3 23:45 sa03
-rw-r--r-- 1 sys  sys  1177344 Dec  4 23:45 sa04
-rw-r--r-- 1 sys  sys  1177344 Dec  5 23:45 sa05
-rw-r--r-- 1 sys  sys  1091496 Dec  6 22:00 sa06
-rw-r--r-- 1 root root   12024 Nov  7 03:15 sa07
-rw-r--r-- 1 sys  sys   429984 Nov  8 23:45 sa08
-rw-r--r-- 1 sys  sys  1154304 Nov  9 23:45 sa09
-rw-r--r-- 1 sys  sys  1154304 Nov 10 23:45 sa10
-rw-r--r-- 1 sys  sys  1154304 Nov 11 23:45 sa11
-rw-r--r-- 1 sys  sys  1154304 Nov 12 23:45 sa12
-rw-r--r-- 1 sys  sys  1154304 Nov 13 23:45 sa13
-rw-r--r-- 1 sys  sys  1154304 Nov 14 23:45 sa14
-rw-r--r-- 1 sys  sys  1154304 Nov 15 23:45 sa15
-rw-r--r-- 1 sys  sys  1154304 Nov 16 23:45 sa16
-rw-r--r-- 1 sys  sys  1154304 Nov 17 23:45 sa17
-rw-r--r-- 1 sys  sys  1154304 Nov 18 23:45 sa18
-rw-r--r-- 1 sys  sys  1154304 Nov 19 23:45 sa19
-rw-r--r-- 1 sys  sys  1154304 Nov 20 23:45 sa20
-rw-r--r-- 1 sys  sys  1142280 Nov 21 23:45 sa21
-rw-r--r-- 1 sys  sys  1173672 Nov 22 23:45 sa22
-rw-r--r-- 1 sys  sys  1292544 Nov 23 23:45 sa23
-rw-r--r-- 1 sys  sys  1292544 Nov 24 23:45 sa24
-rw-r--r-- 1 sys  sys  1292544 Nov 25 23:45 sa25
-rw-r--r-- 1 sys  sys  1292544 Nov 26 23:45 sa26
-rw-r--r-- 1 sys  sys  1292544 Nov 27 23:45 sa27
-rw-r--r-- 1 sys  sys  1292544 Nov 28 23:45 sa28
-rw-r--r-- 1 sys  sys  1292544 Nov 29 23:45 sa29
-rw-r--r-- 1 sys  sys  1292544 Nov 30 23:45 sa30
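
Since the data files are named by day of the month, a script can work out which file holds a given day. The helper below is a sketch with a made-up name; the directory defaults to the one listed above and can be overridden with a second argument.

```shell
# Print the path of the sar data file for a given day of the month.
# (sa_file_for_day is a hypothetical helper name.)
sa_file_for_day() {
    day=${1#0}        # drop one leading zero so "08" is not read as octal
    printf '%s/sa%02d\n' "${2:-/var/adm/sa}" "$day"
}
```

For example, "sa_file_for_day 2" prints /var/adm/sa/sa02, and passing "$(date +%d)" as the argument gives the file for the current day.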

Reviewing Scheduled Data:
There are dozens of reports which can be viewed.

The CPU report for the current day can be seen by running "sar" with no options or with "-u".
Ultra60/root# sar
SunOS Ultra60 5.10 Generic_141444-09 sun4u 12/06/2009
00:00:00 %usr %sys %wio %idle
00:15:01 0 2 0 98
00:30:00 0 2 0 98
00:45:00 0 2 0 98
01:00:00 0 2 0 98
01:15:00 0 2 0 98
...
21:15:00 0 1 0 99
21:30:01 0 1 0 99
21:45:00 0 1 0 99
22:00:00 0 1 0 99
Average 33 33 0 33
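
When only the daily summary is of interest, the closing "Average" line of a report like the one above can be picked out with a small filter. This is a sketch with a hypothetical function name; it assumes %idle is the last column, as in the CPU report shown.

```shell
# Read "sar -u" output on standard input and print the %idle value from
# the closing "Average" line (the last field of that line).
sar_average_idle() {
    awk '$1 ~ /^Average/ { print $NF }'
}
```

It would be used as "sar -u | sar_average_idle".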
Historic memory usage can also be seen via "sar", using the "-r" flag.
Ultra60/root# sar -r
SunOS Ultra60 5.10 Generic_141444-09 sun4u 12/06/2009
00:00:00 freemem freeswap
00:15:01 163651 10451488
00:30:00 163651 10451488
00:45:00 163651 10451485
01:00:00 163651 10451485
01:15:00 163385 10443984
...
21:00:00 190656 19398416
21:15:00 190656 19398416
21:30:01 190656 19398415
21:45:00 190656 19398416
22:00:00 190656 19398416
Average  177153 14924952
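
The "freemem" figures are easier to read as megabytes. The sketch below assumes that "freemem" is a count of memory pages (as the "sar" manual page describes it) and that the page size is 8192 bytes, which is typical of sun4u hardware; the "pagesize" command reports the real value on a given host. The function name is hypothetical.

```shell
# Convert a sar "freemem" page count to whole megabytes.
# The page size in bytes is the optional second argument (8192 assumed).
pages_to_mb() {
    pages=$1
    page_bytes=${2:-8192}
    echo $(( pages * (page_bytes / 1024) / 1024 ))
}
```

Under those assumptions, the 163651 free pages in the first sample work out to 1278 MB.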

The "sar" command will also accept a file specifying a historic database from a previous day in the month, using the "-f" option.
Ultra60/root# sar -k -f /var/adm/sa/sa02
SunOS Ultra60 5.10 Generic_141444-09 sun4u 12/02/2009
00:00:00  sml_mem    alloc fail    lg_mem    alloc fail ovsz_alloc fail
00:15:00 16646400 12855551    0 118005760 92904032    0   37765120    0
00:30:00 16646400 12858815    0 118005760 92904344    0   37765120    0
00:45:00 16646400 12855735    0 118013952 92900872    0   37765120    0
...
23:00:01 17096960 13002143    0 118767616 93431584    0   37765120    0
23:15:00 17096960 13004855    0 118767616 93433176    0   37765120    0
23:30:00 17096960 13011279    0 118775808 93434304    0   37765120    0
23:45:00 17096960 13005679    0 118775808 93428696    0   37765120    0
Average  17031338 12977818    0 118666032 93366984    0   37765120    0

Many other performance data sets can be extracted once Solaris is retaining them automatically; these are only starting examples.