Reading Through An Rss File To Get Specific Data

PRO

dbeard

USA

Asked Sep 2015 — Edited Nov 2016

Resolved by WBS00001!


I have been able to get and save the attached file to my hard drive. I can read it but I can not figure out how to pull just the data I want. In this case it is a stock quote for VZ. I want to just pull out the symbol and stock price.

I would appreciate any examples of how to accomplish that.

Thank You


WBS00001

USA

#1 Sep 2015

You can load the file into an EZ-Builder variable like this:


$FileText = FileReadAll(filename)

filename would be either a literal string (in quotes) or a variable set to the file name (and path) you wish to read.

Then, with the entire file in the variable $FileText, you can search for what you want by using the IndexOf function:


$StartPos = IndexOf($FileText, "VZ")

This assumes nothing else in the file starts with the capital letters VZ, which is unlikely. If that is a possibility, you could add a space to the end of the search phrase, like this: "VZ " instead of "VZ"

If "VZ" is found, you can then get the text from that point on by using the SubString( string1, start, length ) function, like this:


if ($StartPos > 0) # "VZ" was found
  $VZText = SubString($FileText, $StartPos, Length($FileText) - $StartPos)
endif

That will put everything from VZ onward into the $VZText variable.

But you want to have just the VZ and the price. That will depend on how it is structured. If it is something like this: "VZ closed at $22.55."

then you would do another search in the new $VZText variable looking for the dollar sign:


$StartPos = IndexOf($VZText, "$")

What you do then depends on how the price is structured. Is there always a period after it or a space? If it's a period, then look for the first period after the dollar sign. To do that you will have to first further reduce the contents of $VZText. Something like this:


$VZText = SubString($VZText, $StartPos, 10)

That will get the dollar sign and the next 9 characters. How many characters it gets isn't all that critical. Just so long as it gets enough to be sure to get the entire price every time.

Then you search for the period:


$EndPos = IndexOf($VZText, ".")

  #Then you do this:
  #$VZText now begins at the dollar sign, so start at 0.
  #In a price like $22.55 the first period is the decimal point,
  #so add 3 to keep the period and the two digits after it.
$FinalQuote = "VZ = " + SubString($VZText, 0, $EndPos + 3)

Or however you want to phrase it. You don't need to pull the symbol as such since you already know what it is. Only the price.

You may have to tweak the $StartPos and/or $EndPos variables to get just what you want, adding or subtracting 1 as needed.

There are parts above that could be eliminated or combined. They are in this explanation so as to take things step by step.
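For comparison, here is the same step-by-step idea collapsed into one small routine. This is only a sketch in Python, not EZ-Script, and it assumes the text looks like the "VZ closed at $22.55." example above; the function name extract_price is made up for illustration.

```python
def extract_price(text, symbol="VZ"):
    """Return e.g. 'VZ = $22.55', or None if the symbol or a price is not found."""
    start = text.find(symbol)          # plays the role of IndexOf; -1 means not found
    if start < 0:
        return None
    tail = text[start:]                # everything from the symbol onward
    dollar = tail.find("$")
    if dollar < 0:
        return None
    # Collect the digits, commas and periods that follow the dollar sign.
    end = dollar + 1
    while end < len(tail) and (tail[end].isdigit() or tail[end] in ".,"):
        end += 1
    price = tail[dollar:end].rstrip(".,")   # drop a sentence-ending period or comma
    if price == "$":
        return None                    # a dollar sign with no number after it
    return f"{symbol} = {price}"


print(extract_price("Today VZ closed at $22.55."))
```

Scanning forward from the dollar sign, rather than taking a fixed number of characters, avoids having to guess how long the price is.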

Let me know if you need further explanation.

dbeard

PRO

USA

#2 Sep 2015

It may be that the file has some weird characters in it, or my coding is bad, but I can't get your example to locate any text. The $StartPos never gets greater than 0. I have attached the file. VZ.zip

WBS00001

USA

#3 Sep 2015

Ah, I see. It's an HTML file. I was picturing it as a simple text file with the VZ thing being only one small part of it. VZ is all over this file, I guess in a repeating kind of thing. I've only looked it over in a cursory pass so I don't know yet. You should have gotten something other than 0 from IndexOf though, assuming you did not add the space to the search phrase ("VZ " as opposed to "VZ"). It appears there are no instances of "VZ ".

Having said that, the fact it is HTML can be a plus because that format tends to spell out the various parts of what is displayed. Now that I have an example I can give you a better response.
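As a rough illustration of why markup can help, the sketch below uses Python's standard html.parser to keep only the text between tags. The sample markup is hypothetical and is not the contents of the attached file; the names TextOnly and visible_text are made up for this example.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect only the text between tags, ignoring the tags themselves."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # Skip chunks that are only whitespace (line breaks, indentation).
        if data.strip():
            self.parts.append(data.strip())


def visible_text(markup):
    parser = TextOnly()
    parser.feed(markup)
    parser.close()
    return " ".join(parser.parts)


# Hypothetical markup, NOT the real feed contents.
sample = '<table><tr><td class="sym">VZ</td><td>Last</td><td>$22.55</td></tr></table>'
print(visible_text(sample))
```

With the tags gone, each displayed value is left as a separate word, and the quote characters that sit inside the tags disappear along with them.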

WBS00001

USA

#4 Sep 2015

The script language cannot work with the file because it has quote characters in it. There are bugs in the language which cause operations on strings that already contain quotes to throw errors or simply not work as expected. That is what happens when I try to perform operations on even small portions of the file by simple copy and paste. If I get rid of the quote characters, it can then be processed properly. There is a function (ToLine) which will get rid of the tab characters as well as any other unreadable characters, so that's not a problem.

To make it work within the script language the file would first have to be scrubbed of all quote characters. Then it should be able to be read into a variable by a File reading command and processed. The quote characters in the file are not necessary for processing the data to get the stock information from it.

Perhaps whatever you are using to get the data and place it into a file could be modified to get rid of the quote characters in the process? Otherwise you will have to perform a separate process to do that.

I may be able to help you further if I knew the process by which you get the data into a file on your hard drive.

dbeard

PRO

USA

#5 Sep 2015

I am using the following line to get the data and place it on my hard drive.

FileWrite("C:\Users\Public\Documents\VZ.txt", HTTPGet("http://www.nasdaq.com/aspxcontent/NasdaqRSS.aspx?data=quotes&symbol=VZ"))

I found it in another script from another post. It works, gets the file and puts it on my computer, but that's all I can do with it.

Thanks for all your time and effort.

WBS00001

USA

#6 Sep 2015

I see. Interesting. But you could bypass the writing to file part like this:


$RSSText = HTTPGet("http://www.nasdaq.com/aspxcontent/NasdaqRSS.aspx?data=quotes&symbol=VZ")

That will put the contents from the web site directly into a variable. Of course, that still leaves us with the same problem concerning the quote characters. There is no way I know of to scrub the quote marks from within the script since it won't work with strings with quote characters in it already. If there were only 2 quote marks, one at the beginning and one at the end, it would be fine, but that's not how it is.

The only way I can think of off hand would be to go with the original command that you posted causing the data to go to a disk file. Then, using the Exec command, call a separate program which, upon starting, will read the file in and scrub all the quote characters, then write the result back to the same file. Then bring that back into ARC by a FileReadAll command. A very circuitous route but one that is workable. I can write a simple program to do that. I'll do that and see how it goes and get back to you. Maybe in the meantime someone can post here with a better solution. Good or bad, at least what I will do will work.
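A minimal sketch of what such a helper program might look like, written here in Python purely as an illustration. It assumes the troublesome characters are plain double quotes and that the file path is passed as the only command-line argument; the names scrub_quotes and scrub_file are made up.

```python
import sys


def scrub_quotes(text):
    """Remove double-quote characters; everything else is left untouched."""
    return text.replace('"', "")


def scrub_file(path):
    # Read the whole file, strip the quotes, and write the result back in place.
    with open(path, "r", encoding="utf-8", errors="replace") as f:
        text = f.read()
    with open(path, "w", encoding="utf-8") as f:
        f.write(scrub_quotes(text))


# Assumed usage: python scrub.py C:\Users\Public\Documents\VZ.txt
if __name__ == "__main__" and len(sys.argv) == 2:
    scrub_file(sys.argv[1])
```

The file is rewritten in place, so the same path can be handed straight back to FileReadAll once the helper has finished.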

Question though, what do you want to do with the data once it's extracted?

dbeard

PRO

USA

#7 Sep 2015

Pass it to the robot to follow the market. Right now if I try to have it spoken, there is just too much other stuff in the file.

There may be another source for the data.

dbeard

PRO

USA

#8 Sep 2015

Thanks for all the help. I will keep trying to see if there are other options.

dbeard
