tokenizer


John Doe-4
Can you help with this function? Could it be improved
somehow?

fun! GetToken(s, n)
    let l:tokenno=0
    let l:start=0
    let l:end=0

    while l:end>=0
        let l:end=stridx(strpart(a:s, l:start), " ")
        if (l:end>0) || (l:end<0)
            let l:tokenno=l:tokenno+1
            if l:tokenno==a:n
                if l:end>0
                    return strpart(a:s, l:start, l:end)
                else
                    return strpart(a:s, l:start)
                endif
            endif
        endif
        let l:start=l:start+l:end+1
    endwhile
    return -1
endf


       
               

Re: tokenizer

Tim Chase-2
> Can you help with this function? Could it be improved
> somehow?

It looks like you *want* the Nth token in the "s" string passed
in.  To condense it into more of a one-liner, I'd try something
like this semi-untested function (I tried the substitute with a
fixed string and number, but wrapping it in a function and
substituting the values is the untested part):

fun! GetToken(s, n)
        return substitute(a:s, '\(\S\+\s\+\)\{'.(a:n-1).'}\(\S\+\).*', '\2', '')
endfunction

It doesn't return -1 if there's no matching token, and it has
trouble on the first token.  To account for that, you might try
something like:

fun! GetToken(s, n)
   if a:n>1
     let l:expr = '^\%(\S\+\s\+\)\{'.(a:n-1).'}\(\S\+\).*'
   else
     let l:expr = '^\s*\(\S*\).*'
   endif
   let l:result = match(a:s, l:expr)
   if l:result > -1
     let l:result = substitute(a:s, l:expr, '\1', '')
   endif
   return l:result
endfunction


This could be condensed into an ugly, hairy one-liner thanks to
the magic of the inline-if (x?y:z) if you wanted to eliminate the
use of variables, but I'll leave that as an exercise to the reader :)
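
For anyone who wants to see that exercise worked out, here is a rough sketch of the inline-if version. The names GetTokenInline and TokenPattern are made up for illustration so they don't clash with GetToken. The pattern is moved into a small helper only so it isn't spelled out twice, and the patterns themselves are the same two used in the function above.

```vim
" Hypothetical helper: builds the pattern for the Nth token.
" For n > 1, skip n-1 "token + whitespace" groups and capture the next token;
" for the first token, tolerate leading whitespace instead.
function! TokenPattern(n)
  return a:n > 1 ? '^\%(\S\+\s\+\)\{' . (a:n - 1) . '}\(\S\+\).*' : '^\s*\(\S*\).*'
endfunction

" Hypothetical variable-free variant: returns the token, or -1 if there is none.
function! GetTokenInline(s, n)
  return match(a:s, TokenPattern(a:n)) > -1 ? substitute(a:s, TokenPattern(a:n), '\1', '') : -1
endfunction
```

Something like ":echo GetTokenInline('foo bar baz', 2)" should then give back the second token, and asking for a token past the end of the string should give -1.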

-tim
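
A different approach, not taken from either post above: on Vim 7 or later (an assumption, since split() and get() are not available in older versions), the same lookup can probably be done without a regular expression of your own. GetTokenSplit is a hypothetical name. Like the regex version it splits on any run of whitespace, whereas the original function splits on single spaces only.

```vim
" Hypothetical split()-based variant; assumes Vim 7+ for split() and get().
function! GetTokenSplit(s, n)
  " split() with no pattern splits on runs of whitespace and drops
  " empty leading/trailing items; get() falls back to -1 when the
  " index is past the end of the list.
  " The n < 1 guard avoids negative indexes, which count from the end.
  return a:n < 1 ? -1 : get(split(a:s), a:n - 1, -1)
endfunction
```

The trade-off is that the whole string is split even when only an early token is needed, which hardly matters for short strings.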