How to find the lexicographer id in WordNet's nt file without a library - wordnet

I'm trying to link VerbNet with WordNet using the files they provide to work directly with data:
VerbNet => http://verbs.colorado.edu/verb-index/vn/verbnet-3.3.tar.gz
WordNet => http://wordnet-rdf.princeton.edu/static/wordnet.nt.gz
The verbs in VerbNet have a link to WordNet through their sense_key:
e.g. live%2:31:00::
This would be the structure of sense_key:
(lemma)%(part_of_speech_number):(lexical_file_number):(lexicographer_id)::
Parsing the n-triples of the nt file, I have found all the data except the lexicographer_id:
lemma => live
part_of_speech_number => 2
lexical_file_number => 31
lexicographer_id => ??

Parsing the wordnet.nt file doesn't seem to give you this information.
If you download the WordNet 3.1 database from http://wordnetcode.princeton.edu/wn3.1.dict.tar.gz (linked from https://wordnet.princeton.edu/download/current-version), you'll find the file "index.sense", which contains entries like these:
bethel%1:06:00:: 02836245 1 0
bethink%2:31:00:: 00685046 2 1
bethink%2:39:00:: 02171205 1 3
bethlehem%1:15:00:: 08813084 2 0
The current description of this structure is at https://wordnet.princeton.edu/documentation/senseidx5wn
The first field in each line is the sense_key, which is what VerbNet uses; the lexicographer_id is the third colon-separated number after the "%". The second field is the synset_offset, which coincides with the Synset Identifier in the file wordnet.nt.
From "index.sense" you can also get the sense number (the third field) to build names of the form "word.pos.sense_number", such as "man.n.02".
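As a minimal sketch of that lookup, the following Python splits one "index.sense" line into its fields. The names SenseEntry and parse_index_sense_line are made up for this example, and the mapping from part-of-speech numbers to letters (1=n, 2=v, 3=a, 4=r, 5=s) is taken from the WordNet sense-key documentation rather than from the question. The name it builds only has the "word.pos.sense_number" shape; whether it coincides with the synset names used by a particular library is not checked here.

```python
from typing import NamedTuple

# Numeric part of speech used in a sense_key, mapped to a one-letter tag.
POS_LETTERS = {"1": "n", "2": "v", "3": "a", "4": "r", "5": "s"}


class SenseEntry(NamedTuple):
    """One parsed line of index.sense (hypothetical helper type)."""
    lemma: str
    pos_number: str
    lex_filenum: str
    lex_id: str
    synset_offset: str
    sense_number: int
    tag_count: int

    @property
    def name(self) -> str:
        # Builds a string shaped like "word.pos.sense_number".
        return f"{self.lemma}.{POS_LETTERS[self.pos_number]}.{self.sense_number:02d}"


def parse_index_sense_line(line: str) -> SenseEntry:
    """Split 'sense_key synset_offset sense_number tag_cnt' into fields."""
    sense_key, synset_offset, sense_number, tag_count = line.split()
    # Everything before the last '%' is the lemma; the rest is colon-separated.
    lemma, _, rest = sense_key.rpartition("%")
    pos_number, lex_filenum, lex_id = rest.split(":")[:3]
    return SenseEntry(lemma, pos_number, lex_filenum, lex_id,
                      synset_offset, int(sense_number), int(tag_count))


entry = parse_index_sense_line("bethink%2:31:00:: 00685046 2 1")
print(entry.lex_id, entry.synset_offset, entry.name)
```

To resolve a VerbNet sense_key, one would read the whole file into a dictionary keyed by the first field and look the key up there.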

Categorization column based on text contained in 2 other columns within T-SQL query

I'm building a report in Power BI and could setup a Power Query custom column using Text.Contains to solve this problem but the M Code would be very long and I'd rather perform this upstream in the SQL query. I have very little SQL experience.
I'm working with website data from Adobe Analytics. Our website URLs and web pages are grouped into categorical segments based on the product/service the URL or web page corresponds to. A segment is defined by a list of URL paths and/or web page names: sometimes 1 path/page, sometimes over 30.
My result needs to be the following table:
Page URL Path    Page Name        Page Category
varchar(255)     varchar(255)     varchar(255)
Page URL Path examples:
/careers/starting-your-career/scholarships.html
/services/technology/ecommerce.html
Corresponding Page Name Examples:
Career & Scholarships | Company Name
Digital Transformation | E-Commerce | Company Name
There are a total of 76 page categories/segments to define. This screenshot shows an example of some categories and their definition.
Can anyone help me get started in writing this query?
I tried using CONTAINS but I believe this only works within a WHERE statement and I don't think it can be scaled to the needed extent:
SELECT
post_evar3 as 'Page URL Path',
post_evar4 as 'Page Name',
CASE
WHEN post_evar3 CONTAINS ('/services/assurance' or 'services/audit' or 'insights/financial-reporting')
AND (post_evar3 CONTAINS 'asc-842' OR post_evar4 CONTAINS 'asc 842')
THEN 'Audit Services'
WHEN post_evar3 CONTAINS '/services/strategy-and-management-consulting'
THEN 'Business Strategy Operations'
ELSE 'Other'
END AS 'Page Category'
FROM
Marketing.WebAnalytics.WebData
WHERE
exclude_hit = 0
AND hit_source = 1
I've read about Full-Text Search and index solutions, but they are over my head to develop, and I don't know whether that method can be used within the Power BI SQL query environment. I've wondered whether I need to put the definition values into their own table and then join it with the WebData table, though defining a category using both Page URL Path AND Page Name throws me for a loop.
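The M code for this kind of matching is not large, though execution time can vary. Here Table1 holds the data (Column1, Column2) and Table2 is a lookup table with columns Match1, Match2, and Return: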
let
    BufferedTable2 = Table.Buffer(Table2),
    Source = Table.AddColumn(Table1, "Match", (i) => try Table.SelectRows(BufferedTable2, each Text.Contains(i[Column1], [Match1], Comparer.OrdinalIgnoreCase) and Text.Contains(i[Column2], [Match2], Comparer.OrdinalIgnoreCase))[Return]{0} otherwise null, type text)
in
    Source
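To make the logic of that lookup easier to follow, here is a rough Python sketch of the same idea: each rule has a text to find in the URL path, a text to find in the page name, and the category to return, and the first rule where both match (ignoring case) wins. The function name categorize, the RULES list, and the "Careers" category are invented for illustration; only the sample paths and page names come from the question. In this sketch, an empty match string means "any value".

```python
from typing import Optional, Sequence, Tuple

# (text to find in URL path, text to find in page name, category to return)
Rule = Tuple[str, str, str]

# Hypothetical rules; the real table would hold all 76 categories.
RULES: Sequence[Rule] = [
    ("/services/strategy-and-management-consulting", "", "Business Strategy Operations"),
    ("/careers", "scholarships", "Careers"),
]


def categorize(url_path: str, page_name: str, rules: Sequence[Rule]) -> Optional[str]:
    """Return the category of the first rule whose two texts both occur."""
    path = url_path.casefold()
    name = page_name.casefold()
    for match_path, match_name, category in rules:
        # An empty string is contained in every string, so it acts as a wildcard.
        if match_path.casefold() in path and match_name.casefold() in name:
            return category
    return None  # no rule matched, like the "otherwise null" in the M code


print(categorize("/careers/starting-your-career/scholarships.html",
                 "Career & Scholarships | Company Name", RULES))
```

Keeping the definitions in a table rather than in a long CASE expression is the design choice here: adding a category means adding a row, not editing the query.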

How do I generate numbered lists with pod?

Looking at https://docs.raku.org/language/pod#Lists, I don't see a way to create a numbered list:
1. one
2. three
3. four
Is there an undocumented way to do it?
There is currently (as of January 2022) no implemented way to use ordered lists in Pod6.
The historical design documents contain Pod6 syntax for ordered lists and, as far as I know, this remains something that we'd like to add. Once that syntax is implemented, you'll be able to write something like:
=item1 # Animal
=item2 # Vertebrate
=item2 # Invertebrate
=item1 # Phase
=item2 # Solid
=item2 # Liquid
=item2 # Gas
This would produce output along the lines of:
1. Animal
1.1 Vertebrate
1.2 Invertebrate
2. Phase
2.1 Solid
2.2 Liquid
2.3 Gas
(Though the exact syntax for rendering the list would be up to the implementation of the Pod renderer.)
But until that's implemented, there isn't any way to use Pod6 syntax to create an ordered list.
Edit:
I just checked the actual parsed Pod6, and it looks like (to my surprise) the ordered list syntax I showed above actually is parsed internally. For example, running say $=pod[5].raku with the Pod6 shows the following (based on the =item2 # Liquid line):
Pod::Item.new(level => 2, config => {:numbered(1)}, contents => [Pod::Block::Para.new(config => {}, contents => ["Liquid"])])
So the parsing work is in place; it's just the Pod::To::* renderers that need to add support. (And there could even be some out there that already have that support. I do know that neither Rakudo's Pod::To::Text nor Raku's Pod::To::HTML (v0.8.1) currently renders ordered lists, however.)
Workarounds
Depending on the output formats you're targeting, you could of course write the ordered list yourself (pretty easy if you're rendering to plain text, more annoying to do if you're printing to HTML). This does, of course, sacrifice Pod6's multi-output-format support, which is one of its key features.
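To illustrate the numbering scheme such a hand-written list would follow, here is a small sketch in Python (not Raku, and not part of any Pod renderer) that turns (level, text) pairs into the "1.", "1.1", "1.2" labels shown earlier. The function name number_items is made up for this example, and it assumes levels never skip (a level 3 item directly under a level 1 item would get a "0" in its label).

```python
from typing import Iterable, List, Tuple


def number_items(items: Iterable[Tuple[int, str]]) -> List[str]:
    """Render (level, text) pairs as hierarchically numbered lines."""
    counters: List[int] = []  # counters[i] is the current number at level i + 1
    lines: List[str] = []
    for level, text in items:
        # Open counters for any deeper level we are entering.
        while len(counters) < level:
            counters.append(0)
        # Leaving a deeper level discards its counters so it restarts at 1.
        del counters[level:]
        counters[level - 1] += 1
        label = ".".join(str(n) for n in counters)
        if level == 1:
            label += "."  # top-level items are written "1.", "2.", ...
        lines.append(f"{label} {text}")
    return lines


outline = [(1, "Animal"), (2, "Vertebrate"), (2, "Invertebrate"),
           (1, "Phase"), (2, "Solid"), (2, "Liquid"), (2, "Gas")]
print("\n".join(number_items(outline)))
```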
For a workaround that doesn't sacrifice Pod's multi-output nature, you'd probably want to look into manipulating/reformatting the Pod text programmatically. If you do so, the docs to start with are the Pod6 section on accessing Pod and the (unfortunately very short) section on the DOC phaser.
Just use a list and a loop?
my @list = [ (1, 2, 3), (1, 2, ),
             [<a b c>, <d e f>],
             [[1]] ];
for @list -> @element {
    say "{@element} → {@element.^name}";
    for @element -> $sub-element {
        say $sub-element;
    }
}
# OUTPUT:
# 1 2 3 → List
# 1
# 2
# 3
# 1 2 → List
# 1
# 2
# a b c d e f → Array
# (a b c)
# (d e f)
# 1 → Array
# 1

How to retrieve book's information in XML/JSON from library of congress by ISBN

The Library of Congress has a site to search books by ISBN. A simple way to retrieve a book's information is to use a URL like:
http://lccn.loc.gov/2009019559/mods
which returns an XML structure that is easy to parse. The URL requires a unique LCCN in the following format:
http://lccn.loc.gov/[lccn]/mods
I have a batch of books whose ISBNs are encoded in their barcodes. How can I convert an ISBN to an LCCN in order to retrieve the XML data for the book?
You can use the SRU catalog from the Library of Congress. The query would look something like this:
lx2.loc.gov:210/lcdb?version=1.1&operation=searchRetrieve&query=bath.isbn=[ISBN]&maximumRecords=1&recordSchema=mods
Replace [ISBN] with the ISBN you want to look up.
Within that response is an LCCN element. However, the catalog already returns MODS, so it might not be necessary to do anything at all.
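For a batch of ISBNs, the query above can be assembled programmatically. The sketch below only builds the URL and makes no request. The "http://" scheme is an assumption (the query as given has none), build_sru_url is a made-up helper name, and the ISBN is a placeholder sample value. Note that urlencode percent-encodes the "=" inside the query parameter.

```python
from urllib.parse import urlencode

# Host, port and database come from the query above; the scheme is assumed.
SRU_BASE = "http://lx2.loc.gov:210/lcdb"


def build_sru_url(isbn: str) -> str:
    """Build the searchRetrieve URL for one ISBN (hyphens and spaces removed)."""
    digits = isbn.replace("-", "").replace(" ", "")
    params = {
        "version": "1.1",
        "operation": "searchRetrieve",
        "query": f"bath.isbn={digits}",
        "maximumRecords": "1",
        "recordSchema": "mods",
    }
    return f"{SRU_BASE}?{urlencode(params)}"


# Placeholder ISBN, used only to show the shape of the resulting URL.
print(build_sru_url("0-306-40615-2"))
```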
You may use the Google Books API, for example: https://www.googleapis.com/books/v1/volumes?q=LCCN2001051058
The response is in JSON format and includes both ISBN-10 and ISBN-13 identifiers. You will have to batch the requests using your favorite programming language. In Pharo Smalltalk, with the PetitJson parser and Zinc with HTTPS support, it would be:
| parser lccnCollection |
parser := PPParserResource current parserAt: PPJsonParser.
lccnCollection := #('2001051058' '2001051058').
lccnCollection do: [: lccnNumber |
| json jsonObject |
json := (Url absoluteFromText: 'https://www.googleapis.com/books/v1/volumes?q=LCCN' , lccnNumber) retrieveContents contents.
jsonObject := parser parse: json.
" ... retrieve ISBN from jsonObject, etc ... "].
Beware you may need an API key to make batch requests to Google.
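As a rough Python counterpart to the elided "retrieve" step, the sketch below pulls the ISBN identifiers out of an already-parsed response. It assumes the usual Google Books volumes layout (items, then volumeInfo, then industryIdentifiers with "type" and "identifier" keys). The helper names and the sample payload are invented for illustration, and the identifier values are placeholders, not real book data.

```python
from typing import Any, Dict


def google_books_url(lccn: str) -> str:
    """Build the lookup URL in the same form as the example above."""
    return "https://www.googleapis.com/books/v1/volumes?q=LCCN" + lccn


def extract_isbns(payload: Dict[str, Any]) -> Dict[str, str]:
    """Return the first ISBN_10 and ISBN_13 found in a parsed response."""
    found: Dict[str, str] = {}
    for item in payload.get("items", []):
        identifiers = item.get("volumeInfo", {}).get("industryIdentifiers", [])
        for ident in identifiers:
            if ident.get("type") in ("ISBN_10", "ISBN_13"):
                # Keep only the first identifier of each type.
                found.setdefault(ident["type"], ident["identifier"])
    return found


# Made-up payload with placeholder identifiers, only to show the shape.
sample = {"items": [{"volumeInfo": {"industryIdentifiers": [
    {"type": "ISBN_10", "identifier": "0000000000"},
    {"type": "ISBN_13", "identifier": "0000000000000"},
]}}]}
print(google_books_url("2001051058"))
print(extract_isbns(sample))
```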

ActiveRecord find when include?() matches

I have a Resource model with attribute subcategory_list, which is a comma-separated list of subcategories. Is it possible to do a find_by_x method (or equivalent) that pulls only those resources that belong to a certain subcategory?
So if given:
Resource.create(subcategory_list: "Fun, Games") # resource 1
Resource.create(subcategory_list: "Fun") # resource 2
Resource.create(subcategory_list: "Games") # resource 3
I would need a query to get both resources 1 and 2 when my input is "Fun". I can return ONLY "Fun" but not "Fun, Games" with the following
Resource.find_all_by_subcategory_list("Fun")
=> resource 2 (but not resource 1)
Is there a way to modify this query to include "Fun, Games" as well?
If subcategory_list is a comma-separated string (note that this is a substring match, so it would also return a value such as "Funny"):
Resource.where('subcategory_list LIKE ?', "%Fun%")
If subcategory is an associated model:
Resource.joins(:subcategories).where('subcategories.name = ?', "Fun")
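The difference between a substring match (what the LIKE pattern does) and matching a whole entry of the comma-separated list can be seen in a few lines of Python. This is only an illustration of the matching logic, not ActiveRecord code, and has_subcategory is a made-up helper name.

```python
def has_subcategory(subcategory_list: str, wanted: str) -> bool:
    """True if `wanted` is one whole entry of a comma-separated list."""
    entries = [entry.strip() for entry in subcategory_list.split(",")]
    return wanted in entries


for value in ("Fun, Games", "Fun", "Games", "Funny, Games"):
    # Left: substring test, like LIKE '%Fun%'. Right: whole-entry test.
    print(repr(value), "Fun" in value, has_subcategory(value, "Fun"))
```

The two tests agree on the three records from the question and differ only on a value like "Funny, Games", which is the case where the associated-model approach is the safer choice.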
I agree with MrTheWalrus, but you might also want to check out acts_as_taggable_on by Michael Bleigh. It looks like what you are modeling is tags associated with Resource, and the acts_as_taggable_on gem adds a lot of power without much additional code, including what you are trying to do now.

Amazon API -- Can I search Category ALL - Other than DVD etc?

I am trying to build on some API code from Amazon, and I am a noob at this.
I have created a product search using the simple lookup code and set the search field from a form submission, which works fine. However, I don't want to set a specific category such as DVD, BABY, or MUSIC, as I currently do below. I want to search ALL. Is this possible?
include("amazon_api_class.php");
$obj = new AmazonProductAPI(); // I have edited this class and added ALL as a category
try
{
    $result = $obj->searchProducts($query,
        AmazonProductAPI::BABY, // I can change this to DVD or MUSIC and it works, but if I set it to ALL I get errors
        "TITLE"); // tried changing this to KEYWORD; it doesn't work
}
catch(Exception $e)
Any Help Would Be nice.
Thanks
Carl
OK, updated: I believe I have to use KEYWORD when using ALL, so I have added this:
case "KEYWORD" : $parameters = array("Operation" => "ItemSearch",
"Title" => $search,
"SearchIndex" => $category,
"ResponseGroup" => "Small",
"MerchantId" => "All",
"Condition"=>"New",
'Keywords' => $searchTerm);
But I still get this error:
Warning: Invalid argument supplied for foreach() in /data/ADMINwhere2shoponline/www/include/amazon.php on line 23
carl
Carl,
You should be able to use ALL as the search parameter, but you need to make sure that the ItemPage value you request is not more than 5, or it will return an error. All other categories allow up to 10, but ALL is limited to 5.
Check that and see if it resolves your problem.
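One way to guard against that limit is to clamp the page number before building the request. The sketch below is in Python only to show the rule; the limits (5 for ALL, 10 otherwise) are the ones stated above, and clamp_item_page is a hypothetical helper, not part of any Amazon library.

```python
ALL_MAX_ITEM_PAGE = 5       # limit stated above for the ALL search index
DEFAULT_MAX_ITEM_PAGE = 10  # limit stated above for every other category


def clamp_item_page(search_index: str, item_page: int) -> int:
    """Keep the requested page within the range allowed for the category."""
    is_all = search_index.lower() == "all"
    limit = ALL_MAX_ITEM_PAGE if is_all else DEFAULT_MAX_ITEM_PAGE
    # Pages start at 1; anything above the limit is capped at the limit.
    return max(1, min(item_page, limit))


print(clamp_item_page("All", 8), clamp_item_page("DVD", 8))
```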