
Policy on saving NSL data

Posted: Fri Jan 28, 2005 8:17 pm
by RJN
The NSL array of nine 75 Gig SCSI drives keeps filling up. We keep saving data to DVD leaving the FITS files from the last two months, but the drives are too frequently 90% filled or more. What should we do?

1. Bigger disk drives. Switch to large IDE drives. Pricegrabber.com lists external 500G disk drives for under $1 a Gig. For perhaps $5,000, we could increase the NSL data saving capacity by a factor of a few. One problem is that our system administrator and general computer guru is a big fan of SCSI drives, and upgrading them is much more expensive.

2. Save less data. Specifically, save less than 2 months of old FITS files, as is done now. This will help a little bit but be annoying for people wishing to see a FITS file from the month before last.

3. Faster DVD burner. The old one we have now was purchased two years ago, and even though we can't find the actual speed, surely a faster one is available today. This will likely cost around $500.

4. Ring buffer. Start automatically deleting FITS and moving GIF data that is older than two months. JPGs can stay, for now. On the down side, this means losing real data and possibly re-defining the work-study job for one of our key undergraduates. On the up side, this solution automates everything and no humans will be needed to intervene on a regular basis.
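As a rough illustration of option 4, here is a minimal Python sketch of what such a ring buffer might look like. It is not an existing NSL script. The directory layout, the file extensions (.fits/.fit, .gif), the 60-day reading of "two months", and the function name prune_archive are all assumptions made for the example. It defaults to a dry run so nothing is deleted until someone has checked the list.

```python
import shutil
import time
from pathlib import Path

# Assumption: "two months" is approximated as 60 days.
TWO_MONTHS_SECONDS = 60 * 24 * 3600


def prune_archive(data_dir, gif_archive_dir, now=None,
                  max_age=TWO_MONTHS_SECONDS, dry_run=True):
    """Delete old FITS files and move old GIFs; leave JPGs alone.

    Hypothetical helper: gif_archive_dir is assumed to live outside data_dir.
    Returns (deleted_paths, moved_pairs) so the caller can log what happened.
    """
    now = time.time() if now is None else now
    data_dir = Path(data_dir)
    gif_archive_dir = Path(gif_archive_dir)
    deleted, moved = [], []

    # sorted() materializes the listing before any file is touched.
    for path in sorted(data_dir.rglob("*")):
        if not path.is_file():
            continue
        if now - path.stat().st_mtime <= max_age:
            continue  # younger than the cutoff: keep on-line
        suffix = path.suffix.lower()
        if suffix in (".fits", ".fit"):
            deleted.append(path)
            if not dry_run:
                path.unlink()
        elif suffix == ".gif":
            target = gif_archive_dir / path.relative_to(data_dir)
            moved.append((path, target))
            if not dry_run:
                target.parent.mkdir(parents=True, exist_ok=True)
                shutil.move(str(path), str(target))
        # Anything else (e.g. JPGs) stays where it is, for now.
    return deleted, moved
```

Run from a nightly scheduled job with dry_run=False, something like this would need no regular human intervention, but it really does destroy the old FITS data unless those files have already been saved to DVD.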

Any thoughts would be much appreciated!

- RJN

Posted: Fri Jan 28, 2005 8:35 pm
by The Meal
Buy more hard drives! Lots and lots and lots and lots of 'em!!

~Biased Neal

Thoughts on data-saving

Posted: Fri Jan 28, 2005 9:15 pm
by TJ
I wouldn't say I'm against IDE drives; I just want whatever system we choose to make sense in terms of scalability and reliability. The problem with IDE isn't the speed or capacity, but rather the limit on the number of devices -- namely 4. The system drive takes one spot, leaving 3. I'm not sure the case allows space for 3 more drives, and I'm not aware of an external IDE case/mounting solution for a Sun -- hence the SCSI array.

Ideally, we would look at some sort of "real" disk array -- probably attached to the server via SCSI, but probably running with IDE disks inside. The drives that we have are not really in an array, rather a string of 9 individual disks on a SCSI bus. Real arrays, properly implemented, scale quite well -- either through the addition of drives until all slots are populated, or through the connection of an additional drive chassis.

It's all a matter of balancing cost against requirements. More expensive solutions are often more flexible, while cheaper ones (backup tapes come to mind) don't allow fast access to archived data. I'd recommend keeping these principles in mind, regardless of technical limitations like the number of IDE devices.

Size: how much do you want to keep?
Lifetime: how long do you want to keep it?
Access: how much trouble are you willing to go to in order to look at archived data?
alternately, how immediate should access be? online, nearline, or offline?
Cost: what is it worth to do this?

TJ

Posted: Fri Jan 28, 2005 9:31 pm
by lior
Changing the hardware is a long-term solution. In the short term we will probably have to pare down the on-line data. I was thinking about writing a short program that compresses the older files in the archives. On second thought, the JPGs are already compressed, so I need to check whether the additional disk space (if any) gained by using gzip is worth changing our web site.
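One way to run that check, sketched in Python under the assumption that a handful of sample files can simply be read into memory. The function name gzip_savings and the default compression level are illustrative, not part of any existing tool.

```python
import gzip
from pathlib import Path


def gzip_savings(path, level=6):
    """Return (raw_bytes, gzipped_bytes, fraction_saved) for one file.

    Hypothetical helper: reads the whole file into memory, which is fine
    for a few sample images but not for bulk processing.
    """
    raw = Path(path).read_bytes()
    packed = gzip.compress(raw, compresslevel=level)
    saved = 1.0 - len(packed) / len(raw) if raw else 0.0
    return len(raw), len(packed), saved
```

Running it over a few FITS, GIF, and JPG files from the archive would show which types are worth compressing. Already-compressed formats such as JPG typically save little or nothing, and can even come out slightly larger.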

Posted: Sat Jan 29, 2005 1:26 am
by Emoticon Fury
Couldn't you load a full tower with several DVD burners with dual-layer capability? That way you could cram twice as much on one disk and burn more than one disk at a time. Also, has anyone looked into file compression technologies like ZIP, ACE or RAR?

Posted: Sat Jan 29, 2005 8:15 pm
by Matt Merlo
I would go with getting a new DVD burner. This way there is no data lost, but a huge amount of money is not needed.

Posted: Sat Jan 29, 2005 9:03 pm
by Vic Muzzin
Can we do SATA?
I have this hard drive from newegg
http://www.newegg.com/app/ViewProductDe ... 59&depa=1I
63 cents per gig.
It may just be my imagination but I believe my SATA drive responds much faster than my IDE drive.
I also recommend this DVD burner.
http://www.newegg.com/app/ViewProductDe ... 962&depa=1
$63.50
Maybe with careful shopping we could employ both solutions, thereby best preparing the system for future expansion.

What is the speed of the current burner?

Storage

Posted: Mon Jan 31, 2005 4:52 pm
by nbrosch
I suggest NOT going SCSI, but using the cheaper alternative. We are regularly using 200GB ATA disks and are pretty happy. Note though that we are doubling the data up on two disks, so that a crash does not destroy irreplaceable observation data.
Noah Brosch
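
For readers wondering what "doubling the data up" could look like in software, here is a minimal Python sketch. It is not the setup described above, whose details are not given. The mount points passed as disk_a and disk_b and the function names are assumptions for illustration. Each file is copied to both disks and each copy is verified against a SHA-256 checksum of the source.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Checksum a file in chunks so large images need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def save_mirrored(src, disk_a, disk_b):
    """Copy src onto two disks (hypothetical mount points) and verify both."""
    src = Path(src)
    expected = sha256_of(src)
    copies = []
    for disk in (disk_a, disk_b):
        dest = Path(disk) / src.name
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dest)  # copy2 also preserves timestamps
        if sha256_of(dest) != expected:
            raise IOError(f"checksum mismatch on {dest}")
        copies.append(dest)
    return copies
```

A hardware or OS-level mirror would do the same job transparently. The sketch only shows the idea that a single disk crash should never hold the only copy of an observation.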

Posted: Wed Aug 29, 2007 6:59 pm
by dnabost
With the way technology is growing at an exponential rate, storage is becoming cheaper and cheaper. It should not cost you an arm and a leg.

I would probably say to go with the DVD burner though.

Disks vs. DVDs

Posted: Wed Aug 29, 2007 7:14 pm
by nbrosch
The advantage of disks is the on-line availability of the images and the reduced hassle compared with mounting DVDs. One can now have 4x500 GB disks in a single enclosure, for 2 TB of storage (1 TB with redundancy). However, images have grown as well. The CONCAM IV now in testing at the Wise Observatory produces ~13MB images and we probably will go to a faster cadence than the lower CONCAM models. This implies a data generation of some 6 GB per night and even a 1 TB storage will fill up in six months...
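
The arithmetic behind that estimate can be checked with a few lines of Python. The ~13 MB image size, ~6 GB per night, and 1 TB capacity come from the post. Decimal units (1 TB = 1000 GB), observing on every night, and an average month of 30.44 days are assumptions of this sketch.

```python
IMAGE_MB = 13          # approximate CONCAM IV image size, from the post
NIGHTLY_GB = 6         # approximate data volume per night, from the post
CAPACITY_GB = 1000     # 1 TB of redundant storage, decimal units assumed
DAYS_PER_MONTH = 30.44 # assumption: average month length


def nights_to_fill(capacity_gb, nightly_gb):
    """Number of observing nights until the storage is full."""
    return capacity_gb / nightly_gb


nights = nights_to_fill(CAPACITY_GB, NIGHTLY_GB)
months = nights / DAYS_PER_MONTH              # assumes data every night
images_per_night = NIGHTLY_GB * 1000 / IMAGE_MB

print(f"{nights:.0f} nights, about {months:.1f} months")
print(f"roughly {images_per_night:.0f} images per night")
```

Under these assumptions the disk fills in about 167 nights, a little under six months, at roughly 460 images per night. Cloudy or otherwise lost nights would stretch that somewhat.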