general, how to convert text data

02-14-2005, 11:40 PM
I'm pretty new to the programming stuff. I know that stuff can be done, but usually not how.

Let's say I want to take a text file like this (http://www.comicimages.com/rawdeal/resources/cardlist/premiere.txt) and somehow break the entries up and convert the data into files that look like this (http://www.baraen.net/Chop.card).

What would be the easiest way to do this? I don't really want to do it by hand, since there are 2000 cards. I've gathered Perl is useful for this kind of thing, but I know nothing about it. I don't mind learning whatever I need to learn to do it, or messing around with regular expressions or whatever.

02-15-2005, 12:00 AM
I was going to make it for you in Python (as a simple program to take my mind off trying to design a site), but I can't figure out how you got the details of the Chop card from the card list. Where do power, sets, cost and colour come from? Programming can do a lot of things, but mind-reading isn't one of them yet...

02-15-2005, 12:48 AM
I didn't expect anyone to do it for me, so I didn't think specific details were needed.

If you really want to:

This is a hack, to get the files into a format that another program (Apprentice 2.0) will accept.

Name should be obvious.

Color is code for the card type, mapped as such:
Maneuver = White
Action = Blue
Reversal = Red
Maneuver / Reversal = Green
Action / Reversal = Black
Maneuver / Action = Gold
Pre-match = Artifact
Mid-match = Colorless
Strike, Grapple, Submission, High Risk, and Trademark Finisher are types of Maneuvers.
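
For what it's worth, the color mapping above could be written as a lookup table, something like the rough Python sketch below. The function name color_for_type is made up for illustration, and it assumes the type comes in as a plain string like "Strike" or "Grapple / Reversal", with " / " between the parts of a hybrid type. The real card list may write types differently.

```python
# Card-type-to-color table copied from the list above.
COLOR_BY_TYPE = {
    "Maneuver": "White",
    "Action": "Blue",
    "Reversal": "Red",
    "Maneuver / Reversal": "Green",
    "Action / Reversal": "Black",
    "Maneuver / Action": "Gold",
    "Pre-match": "Artifact",
    "Mid-match": "Colorless",
}

# Subtypes that count as Maneuvers for color purposes.
MANEUVER_SUBTYPES = {
    "Strike", "Grapple", "Submission", "High Risk", "Trademark Finisher",
}


def color_for_type(card_type):
    """Return the color code for a type string such as "Strike / Reversal".

    Assumption: hybrid types use "/" between the parts, in the same
    order as the table above. Unknown types raise KeyError.
    """
    parts = [part.strip() for part in card_type.split("/")]
    # Collapse maneuver subtypes (Strike, Grapple, ...) into "Maneuver".
    parts = ["Maneuver" if part in MANEUVER_SUBTYPES else part for part in parts]
    return COLOR_BY_TYPE[" / ".join(parts)]
```

Letting an unknown type raise KeyError is deliberate here: with 2000 cards it seems safer to stop on a surprise than to guess a color.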

Cost is a code for the rarity (I just decided to change this to numbers rather than use letters).
Common = 1
Uncommon = 2
Rare = 3
Starter = 4
Ultra Rare = 5

Sets are the sets the card is in, using two-letter abbreviations, comma-separated.
PR = Premiere
S1 = Survivor Series One
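
The cost and set codes could be handled the same way, with two small tables. This is only a sketch: the names cost_for_rarity and sets_field are invented for the example, only the two sets listed above are included, and joining with a bare comma (no space) is a guess at what "comma-separated" means for the target format.

```python
# Rarity-to-cost codes from the list above.
COST_BY_RARITY = {
    "Common": 1,
    "Uncommon": 2,
    "Rare": 3,
    "Starter": 4,
    "Ultra Rare": 5,
}

# Only the two sets named in the post; any others would need adding.
SET_CODES = {
    "Premiere": "PR",
    "Survivor Series One": "S1",
}


def cost_for_rarity(rarity):
    """Translate a rarity name into its numeric cost code."""
    return COST_BY_RARITY[rarity.strip()]


def sets_field(set_names):
    """Build the sets value from full set names.

    Assumption: codes are joined with a bare comma and no space.
    """
    return ",".join(SET_CODES[name.strip()] for name in set_names)
```

For example, sets_field(["Premiere", "Survivor Series One"]) gives "PR,S1".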

Type should be obvious.

Power is "F/D", so 0/2 for Chop.

Text should be obvious: everything between the type and the F/D.

Flavor text (the quotes) and the set number are discarded.
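
The power and text rules could look roughly like this in Python. It's a sketch under assumptions: the helper names are hypothetical, the F and D values are assumed to be already pulled out as numbers, and flavor text is assumed to sit on its own line wrapped in double quotes. Dropping the set number isn't shown, since that depends on how it appears in the source file.

```python
def power_field(f, d):
    """Format the F and D values as "F/D", e.g. 0 and 2 become "0/2"."""
    return f"{f}/{d}"


def rules_text(body_lines):
    """Join the lines between the type and the F/D into one text value.

    Assumption: a flavor-text line starts and ends with a double quote,
    so any such line is discarded. Blank lines are skipped too.
    """
    kept = []
    for line in body_lines:
        stripped = line.strip()
        if not stripped:
            continue
        is_quoted = (
            len(stripped) >= 2
            and stripped.startswith('"')
            and stripped.endswith('"')
        )
        if is_quoted:
            continue  # flavor text, thrown away
        kept.append(stripped)
    return " ".join(kept)
```

Rules text that happens to be entirely wrapped in quotes would be dropped by mistake, so the output would want a quick look over.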

02-15-2005, 01:49 AM
Ack! I just realized what a mess this would be to do! I won't do it for you, simply because it's far too complex for a free script of no real personal interest to me (perhaps a little blunt, but true...), but I'd recommend Python and regular expressions (the re module) for the job. Pretty much any language would do, as almost all of them seem to support regular expressions these days, but Python has IDLE :). See the thread "python learning" for my recommended Python IDE and some links for learning.

BTW I tried doing a multi-level regular expressions split & match two times only to realize that there was a higher level to search each time. Add this to my general dislike of string hacking, and you may be able to understand why I got too frustrated to continue coding it.
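
To give an idea of the split-then-match approach with the re module, here is a rough sketch. The layout in SAMPLE is completely made up for illustration (blank line between entries, name on the first line, type on the second, a last line like "F: 0 D: 2"), and the rules text in it is a placeholder; the real card list almost certainly differs, so the patterns would need rewriting to fit it.

```python
import re

# HYPOTHETICAL layout, not taken from the real card list.
SAMPLE = """\
Chop
Strike
Placeholder rules text.
F: 0 D: 2

Mystery Card
this entry does not follow the layout
"""

# Level 1: entries are separated by one or more blank lines.
ENTRY_SPLIT = re.compile(r"\n\s*\n")
# Level 2: the last line of an entry carries the F and D numbers.
STATS = re.compile(r"^F:\s*(\d+)\s+D:\s*(\d+)$")


def parse_entry(block):
    """Return a dict of fields for one entry, or None if it doesn't fit."""
    lines = [line.strip() for line in block.strip().splitlines()]
    if len(lines) < 3:
        return None
    match = STATS.match(lines[-1])
    if match is None:
        return None
    return {
        "name": lines[0],
        "type": lines[1],
        "text": " ".join(lines[2:-1]),
        "power": f"{match.group(1)}/{match.group(2)}",
    }


def parse_cards(raw):
    """Split raw text into entries; return (parsed, leftovers)."""
    parsed, leftovers = [], []
    for block in ENTRY_SPLIT.split(raw.strip()):
        card = parse_entry(block)
        if card is None:
            leftovers.append(block)  # keep for manual handling
        else:
            parsed.append(card)
    return parsed, leftovers
```

Calling parse_cards(SAMPLE) parses the first entry and puts the second in leftovers, so whatever the patterns can't handle is set aside rather than silently mangled.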

My apologies for getting your hopes up (if I did).


02-15-2005, 02:34 AM
Nah, I would've expected you to give up if I expected anything.

I'll take a look at Python. I was going to try it out a little someday anyway, since it seems like a "cool" language.

I'll have to weigh the time to learn how to code + code it + the value of learning vs. the time it would take to create the files by hand.

If I could get even part of the files done by regex, it would make it less of a chore.