  1. #1
    New to the CF scene
    Join Date
    Nov 2016
    Location
    American Midwest
    Posts
    3
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Beginner question: Can I do this in Java, and should I?

    I am studying the grammar of an understudied Asian language and need a computer program to help me with textual analysis. Specifically, I need two basic functions:
    1) Search and replace all instances in a text file. (I need to standardize my text file for analysis and remove common spelling errors, etc. MS Word's replace tool works well, except that it handles only one term at a time. There are over 100 common errors that I need to replace every time I add more text to my analysis file, so I need a tool that I can program with all of these find-and-replace pairs and then run in a single execution each time I update the file.)
    2) Search for grammar patterns based on specific data sets. (I would need to define data categories like NOUN, NUMERAL, QUALIFIER, etc. and create a table containing the elements of each category. I would then input a pattern for the program to search for. For example, I would select NUMERAL + NOUN + QUALIFIER, and it would return all instances in the text file where elements of these data sets occur in exactly that order.)
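    For illustration, the first function (applying a whole table of corrections in one run) can be sketched in a few lines of Java. This is only a minimal sketch: the class name `BatchReplace`, the method `applyAll`, and the two English sample corrections are assumptions made up for the example, and a real table would hold the 100+ corrections for the target language.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class BatchReplace {
    // Hypothetical correction table: misspelling -> standard form.
    // LinkedHashMap keeps insertion order, so rules run in the order listed.
    static final Map<String, String> RULES = new LinkedHashMap<>();
    static {
        RULES.put("teh", "the");
        RULES.put("recieve", "receive");
    }

    // Applies every rule in turn. String.replace does literal substring
    // replacement, so a rule also fires inside longer words.
    static String applyAll(String text, Map<String, String> rules) {
        String result = text;
        for (Map.Entry<String, String> rule : rules.entrySet()) {
            result = result.replace(rule.getKey(), rule.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        // prints: the students receive the book
        System.out.println(applyAll("teh students recieve teh book", RULES));
    }
}
```

    Because the rules are applied one after another, an earlier replacement can create text that a later rule then matches, so the order of the table matters.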

    Is it possible to write such a program in Java, or would I be better served by a different language for these purposes?
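    As a rough indication that the second function is feasible in plain Java, here is a minimal sketch of a category-based pattern search. It assumes the text can be split into words on whitespace and that each category is a simple set of words; the class name `CategoryPatternSearch`, the method `find`, and the English placeholder entries are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class CategoryPatternSearch {
    // Returns every run of consecutive words whose categories match
    // the pattern in exactly the given order.
    static List<String> find(String[] words,
                             Map<String, Set<String>> categories,
                             List<String> pattern) {
        List<String> hits = new ArrayList<>();
        if (pattern.isEmpty()) {
            return hits;
        }
        for (int start = 0; start + pattern.size() <= words.length; start++) {
            boolean match = true;
            for (int k = 0; k < pattern.size() && match; k++) {
                Set<String> members = categories.get(pattern.get(k));
                match = members != null && members.contains(words[start + k]);
            }
            if (match) {
                hits.add(String.join(" ",
                        Arrays.copyOfRange(words, start, start + pattern.size())));
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Placeholder English entries; a real table would list words
        // of the language being studied.
        Map<String, Set<String>> categories = new HashMap<>();
        categories.put("NUMERAL", new HashSet<>(Arrays.asList("two", "three")));
        categories.put("NOUN", new HashSet<>(Arrays.asList("dogs", "houses")));
        categories.put("QUALIFIER", new HashSet<>(Arrays.asList("big", "old")));

        String[] words = "we saw two dogs big near three houses old".split("\\s+");
        // prints: [two dogs big, three houses old]
        System.out.println(find(words, categories,
                Arrays.asList("NUMERAL", "NOUN", "QUALIFIER")));
    }
}
```

    A word that belongs to more than one category is handled naturally here, since each position in the pattern is checked against its own set.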

    I could possibly do the first of these functions with AutoHotkey if I needed to, but I'm not so sure about the second. I have already started learning Java, partly because I need these tools and partly because I would just like to expand my skill set. I will probably continue learning Java regardless of whether I can use it for this project. However, being a thorough amateur at this point, I still have little conception of which programming languages are best suited to which scenarios.
    Thanks!

  2. #2
    New to the CF scene
    Join Date
    Nov 2016
    Location
    American Midwest
    Posts
    3
    Thanks
    0
    Thanked 0 Times in 0 Posts
    I hate to ask a question only to withdraw it. I finally stumbled on some new information last night, and I believe that I am going to do the first function with a word processor macro and the second with the GATE text mining software.
    I hope to continue learning Java regardless and hope to see you all around later.

  3. #3
    New Coder
    Join Date
    Oct 2016
    Location
    Canada
    Posts
    12
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Yes. First get all the core Java and JSP concepts clear, then start coding.

  4. #4
    New to the CF scene
    Join Date
    Nov 2016
    Location
    American Midwest
    Posts
    3
    Thanks
    0
    Thanked 0 Times in 0 Posts
    UPDATE: After playing around with GATE for a couple of weeks, I feel like I am beginning to understand the concept of tokenizing text and assigning annotations. What I can't find now is a way to search by those annotations. (E.g., I can assign the tag NN to dog, cat, and fish and the tag VB to run, swim, and jump, but I don't know how to write a search app that looks for the pattern NN + VB in a corpus of text.) I think that Lucene probably has the tools I am looking for somewhere within it, but I am completely unfamiliar with Lucene at this point and am not even sure that it is the tool I need.
    Does anyone know how to create a search tool in Java, especially one that can search by annotations within text? Any tips?
    Thanks!
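
    One way to approach the annotation search described above, without relying on any GATE or Lucene API, is to write the annotated text out in a plain `word/TAG` format and search it with `java.util.regex`. The sketch below assumes that export format (it is not something GATE is claimed to produce by default), and the class name `TagRegexSearch` and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TagRegexSearch {
    // Builds a regex that matches consecutive "word/TAG" tokens
    // for the given tag sequence, capturing each word.
    static Pattern compile(String... tags) {
        StringBuilder regex = new StringBuilder("(?<!\\S)");
        for (int i = 0; i < tags.length; i++) {
            if (i > 0) {
                regex.append("\\s+");
            }
            regex.append("(\\S+)/").append(Pattern.quote(tags[i]));
        }
        regex.append("(?!\\S)");
        return Pattern.compile(regex.toString());
    }

    // Returns the words of every match, joined by single spaces.
    static List<String> find(String taggedText, String... tags) {
        List<String> hits = new ArrayList<>();
        if (tags.length == 0) {
            return hits;
        }
        Matcher m = compile(tags).matcher(taggedText);
        while (m.find()) {
            StringBuilder words = new StringBuilder();
            for (int g = 1; g <= m.groupCount(); g++) {
                if (g > 1) {
                    words.append(' ');
                }
                words.append(m.group(g));
            }
            hits.add(words.toString());
        }
        return hits;
    }

    public static void main(String[] args) {
        String tagged = "the/DT dog/NN runs/VB and/CC fish/NN swim/VB";
        // prints: [dog runs, fish swim]
        System.out.println(find(tagged, "NN", "VB"));
    }
}
```

    Note that `Matcher.find` returns non-overlapping matches, so a pattern such as NN + NN run over three nouns in a row would report only the first pair.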


 
