FastText Vector to Numpy Array - numpy
I have used FastText to train a model on a sarcasm.txt file. The text file looks like the sample below (I tokenized it first, which is why it looks like this):
__label__1 [ ' beijing ' , ' fire ' , ' department ' , ' extinguishe ' , ' massive ' , ' fivealarm ' , ' burn ' , ' cloud ' , ' of ' , ' smog ' ]
__label__0 [ ' the ' , ' outfit ' , ' that ' , ' get ' , ' this ' , ' woman ' , ' kick ' , ' out ' , ' of ' , ' she ' , ' school ' , ' gym ' ]
__label__1 [ ' area ' , ' smoker ' , ' one ' , ' of ' , ' america ' , ' top ' , ' phlegmproducer ' ]
__label__1 [ ' ryan ' , ' seacr ' , ' nervous ' , ' about ' , ' how ' , ' audience ' , ' will ' , ' respond ' , ' to ' , ' slightly ' , ' short ' , ' haircut ' ]
__label__0 [ ' lot ' , ' of ' , ' parent ' , ' know ' , ' this ' , ' scenario ' ]
__label__1 [ ' great ' , ' daughter ' , ' measure ' , ' selfworth ' , ' against ' , ' some ' , ' 13yearold ' , ' name ' , ' skyla ' , ' now ' ]
__label__0 [ ' video ' , ' relaunche ' , ' investigation ' , ' into ' , ' death ' , ' of ' , ' man ' , ' hold ' , ' by ' , ' chicago ' , ' police ' ]
__label__0 [ ' trumpcare ' , ' score ' , ' so ' , ' badly ' , ' it ' , ' could ' , ' actually ' , ' help ' , ' the ' , ' senate ' ]
__label__1 [ ' break ' , ' flight ' , ' attendant ' , ' currently ' , ' attempt ' , ' to ' , ' pass ' , ' cup ' , ' of ' , ' cranberry ' , ' juice ' , ' over ' , ' your ' , ' laptop ' ]
__label__1 [ ' newly ' , ' swornin ' , ' north ' , ' korean ' , ' official ' , ' wonder ' , ' how ' , ' hell ' , ' eventually ' , ' be ' , ' execute ' ]
I used this code to train a FastText model on it:
model_testing = fasttext.train_supervised('sarcasm_train.txt', minn=1, maxn=3)
I can access word embedding for a word using this code:
model_testing.get_word_vector('colorblind')
The vector looks like this:
array([ 0.02441351, -0.02762945, 0.0044672 , 0.02173712, -0.02845885,
-0.01788719, -0.01404536, -0.00040487, -0.02752208, -0.01858106,
0.00499649, -0.00495898, -0.02149334, -0.03080621, 0.01518185,
0.02701956, 0.00429512, -0.01376632, 0.00949627, 0.01379911,
0.01134215, -0.03492524, -0.02157653, 0.00827039, 0.00609733,
0.0153464 , 0.00993295, 0.00396577, 0.01779681, -0.01860862,
0.02510854, 0.0039301 , 0.01420423, 0.04053118, 0.05107417,
0.00303496, -0.00626393, 0.02165087, -0.00677479, 0.02452566,
-0.01753292, -0.0372146 , -0.0109925 , -0.01465018, 0.00677392,
-0.0062581 , -0.00041438, 0.01762512, 0.01957742, -0.02030487,
0.03129215, -0.02718819, 0.02537155, -0.02269336, 0.01991356,
0.01586418, 0.01099472, -0.00405847, -0.02595887, 0.0024101 ,
0.02718542, -0.01486973, -0.02627936, -0.03032344, 0.00531832,
0.00363665, 0.02218294, -0.01040652, -0.00741611, -0.02091474,
0.03373858, -0.01403952, 0.01352888, 0.01178332, 0.00370314,
0.01607108, -0.01730891, -0.0314983 , 0.00030702, 0.00751614,
0.0237149 , -0.0080571 , -0.02801514, 0.02703649, 0.00633383,
0.01455203, -0.00644819, 0.00479063, 0.02993772, 0.00366506,
-0.03687849, -0.01704783, -0.03367983, 0.03158782, -0.00666518,
0.02971195, -0.01610409, -0.02939436, 0.02555595, 0.0267504 ],
dtype=float32)
I have 3 questions:
Does this vector look right? Should I skip tokenization?
How can I access the vectors for all the words in my vocabulary? Can I build a NumPy array from those vectors?
Can I split those vectors by unigram, bigram and trigram, each with the same 100 features?
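A minimal sketch for the three questions, assuming the official fasttext Python bindings and a trained model such as model_testing above. The helper names (format_fasttext_line, build_embedding_matrix, subword_vectors_by_length) are made up for illustration. Two things to keep in mind: fastText splits training text on whitespace, so the brackets, quotes and commas in the file above are probably being learned as tokens of their own; and minn/maxn control character n-grams, so the "unigram/bigram/trigram" split below is over character n-grams of length 1 to 3 (word n-grams are controlled by wordNgrams instead).

```python
from collections import defaultdict

import numpy as np


def format_fasttext_line(label, tokens):
    """Build one training line: '__label__<label>' plus space-separated tokens."""
    return "__label__{} {}".format(label, " ".join(tokens))


def build_embedding_matrix(model):
    """Stack the vector of every vocabulary word into one 2-D NumPy array."""
    words = model.get_words()
    matrix = np.vstack([model.get_word_vector(word) for word in words])
    return words, matrix  # matrix[i] is the vector for words[i]


def subword_vectors_by_length(model, word):
    """Group the input vectors of a word's subwords by subword length.

    get_subwords() also returns the whole word when it is in the vocabulary,
    and n-grams at word boundaries include the '<' / '>' markers, so filter
    the result if only pure character n-grams are wanted.
    """
    subwords, indices = model.get_subwords(word)
    # One row per vocabulary word plus one per n-gram bucket; can be large.
    input_matrix = model.get_input_matrix()
    groups = defaultdict(list)
    for subword, index in zip(subwords, indices):
        groups[len(subword)].append((subword, input_matrix[index]))
    return dict(groups)


# Hypothetical usage with the model from the question:
# words, matrix = build_embedding_matrix(model_testing)
# groups = subword_vectors_by_length(model_testing, "colorblind")
```

Every vector, whether for a word or a character n-gram, has the same dimension (100 by default), so each group can be stacked with np.vstack in the same way.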
Related
I'm creating this trigger in SQL, but at the end of execution, when I enter more than one data, it changes all CPF's, how to solve?
I'm creating this trigger in SQL, but at the end of execution, when I insert more than one row, it changes every CPF. How do I solve this?
CREATE TRIGGER TGR_Limpa_Cpf ON tbl_Cobranca FOR INSERT AS
BEGIN
DECLARE @CPF VARCHAR (14), @CPF_Valido VARCHAR (11)
SELECT @CPF = CPF FROM INSERTED
SELECT @CPF_Valido = REPLACE(REPLACE(@CPF,'.', ''),'-', '')
UPDATE tbl_Cobranca set CPF = @CPF_Valido
END
GO
When I run INSERT statements like the ones below, the CPF column of all 10 rows is set to the last CPF. What do I do?
INSERT INTO tbl_Cobranca VALUES ( '600.599.013-49',146024000,742.66, '19/07/2019 ' ,1, ' 62085637761 ' , 'Inativo ', 'Devolvido ', ' 30/03/2020 ' , ' PGMAIS ' ,250,NULL,2, ' Lindseia Domingos Bacaicoa ' )
INSERT INTO tbl_Cobranca VALUES ( '347.275.181-20',23100372,1081.22, '18/03/2020 ' ,4, ' 61354685318 ' , 'Inativo ', 'Devolvido ', ' 08/04/2020 ' , ' PGMAIS ' ,30,NULL,2, ' Antonio Gaetano Franco ' )
INSERT INTO tbl_Cobranca VALUES ( '375.416.474-28',10062872,1235.58, '22/08/2019 ' ,2, ' 45569215172 ' , 'Ativo ', 'Ativo ', ' 01/04/2020 ' , ' PGMAIS ' ,336,NULL,1, ' Waldecy Aranha De Souza ' )
INSERT INTO tbl_Cobranca VALUES ( '620.421.024-25',9080169,709.78, '02/12/2018 ' ,5, ' 20637885865 ' , 'Ativo ', 'Ativo ', ' 01/04/2020 ' , ' PGMAIS ' ,599,NULL,1, ' Josiane Goncalves De Oliveira ' )
INSERT INTO tbl_Cobranca VALUES ( '587.715.716-59',12067127,568.84, '12/05/2020 ' ,13, ' 30940355198 ' , 'Ativo ', 'Ativo ', ' 01/04/2020 ' , ' PGMAIS ' ,72,NULL,1, ' Vandeir Pedro Goncalves ' )
INSERT INTO tbl_Cobranca VALUES ( '551.112.540-72',97169365,886.2, '07/05/2020 ' ,0, ' 75769424904 ' , 'Inativo ', 'Devolvido ', ' 28/05/2020 ' , ' PGMAIS ' ,29,NULL,2, ' Sergio Aparecido De Vasconcelos ' )
INSERT INTO tbl_Cobranca VALUES ( '544.877.944-08',148016204,649.23, '13/03/2020 ' ,3, ' 13896261143 ' , 'Inativo ', 'Devolvido ', ' 06/04/2020 ' , ' PGMAIS ' ,42,NULL,2, ' Jocelia Giboski ' )
INSERT INTO tbl_Cobranca VALUES ( '530.036.855-73',137014060,677.51, '17/02/2017 ' ,0, ' 63755063318 ' , 'Inativo ', 'Devolvido ', ' 30/03/2020 ' , ' PGMAIS ' ,1132,NULL,2, ' Carlos Roberto Toniolli ' )
INSERT INTO tbl_Cobranca VALUES ( '361.078.484-75',146025052,995.47, '19/03/2020 ' ,4, ' 43587796897 ' , 'Ativo ', 'Ativo ', ' 01/04/2020 ' , ' PGMAIS ' ,126, ' RENAULT ' ,1, ' Rosana De Fatima Mendonca ' )
INSERT INTO tbl_Cobranca VALUES ( '260.057.428-95',97140256,1043.69, '08/12/2019 ' ,5, ' 78098625396 ' , 'Ativo ', 'Ativo ', ' 01/04/2020 ' , ' PGMAIS ' ,228,NULL,1, ' Percilia Santos Do Amaral ' )
How to Select 2 Different Names from same table and display them on the Condition
I have this Query SELECT NAME_NO ,( SELECT FNAME || ' ' || LNAME || ' ' || BIRTH_DT || ' ' || ' ' || PHONE FROM NAMES WHERE NAME_NO = 1 ) AS "NAME1: NAME, DOB, PHONE" ,( SELECT FNAME || ' ' || LNAME || ' ' || BIRTH_DT || ' ' || ' ' || PHONE FROM NAMES WHERE NAME_NO = 2 ) AS "NAME2: NAME, DOB, PHONE" , FROM NAMES; I get this error: 01427. 00000 - "single-row subquery returns more than one row" I need multiple records. What is the best method to solve this?
Try this one (both branches of a UNION must select the same number of columns, so NAME_NO appears in each):
SELECT NAME_NO, FNAME || ' ' || LNAME || ' ' || BIRTH_DT || ' ' || ' ' || PHONE AS "NAME1: NAME, DOB, PHONE" FROM NAMES
UNION
SELECT NAME_NO, FNAME || ' ' || LNAME || ' ' || BIRTH_DT || ' ' || ' ' || PHONE AS "NAME2: NAME, DOB, PHONE" FROM NAMES WHERE NAME_NO = 2
or this one (the columns are qualified because VAL exists in both N1 and N2):
WITH N1 AS ( SELECT NAME_NO, FNAME || ' ' || LNAME || ' ' || BIRTH_DT || ' ' || ' ' || PHONE AS "VAL" FROM NAMES WHERE NAME_NO = 1),
N2 AS ( SELECT FNAME || ' ' || LNAME || ' ' || BIRTH_DT || ' ' || ' ' || PHONE AS "VAL" FROM NAMES WHERE NAME_NO = 2)
SELECT N1.NAME_NO, N1.VAL AS "NAME1: NAME, DOB, PHONE", N2.VAL AS "NAME2: NAME, DOB, PHONE" FROM N1, N2;
You need to use PIVOT. Try this Select A "NAME1: NAME, DOB, PHONE" , B "NAME2: NAME, DOB, PHONE" from (SELECT FNAME || ' ' || LNAME || ' ' || BIRTH_DT || ' ' || ' ' || PHONE N, NAME_NO FROM NAMES) Pivot (Max(N) for NAME_NO in (1 as A, 2 as B) );
SQL Developer Query returning blank output
I want to only see all records that are in position "substr(B.GLDEBITACCT,24,8)" from the string below 'FDN-XXXXXX-XXXXXXX-XXX-XXTXXXXX-0000-0000' So in my query displayed below, I chose to specifically pull in all records by the letter 'T' so I wrote the syntax as AND SUBSTR(M.GLDEBITACCT,24,8) = 'T'. Now this is giving me a blank output. Can someone help me decipher what I am not doing right here? I hope my question is succinct enough. SELECT SUBSTR(M.GLDEBITACCT,24,8) as PROJECT, G.COMPTEXT AS PROJECT_NAME, ' ' PROJECT_LEADER, TO_CHAR(W.EX2DERNUM) AS DERNUM, ' ' AS DERLINENUM, ' ' AS REQUESTNUM, M.REFWO AS WONUM, W.PARENT AS PARENT_WONUM, ' ' AS PRNUM, ' ' AS PRLINENUM, ' ' AS PR_STATUS, TO_CHAR(M.PONUM) AS PO_NUMBER, TO_CHAR(M.POLINENUM) AS POLINE_NUMBER, ' ' AS PO_STATUS, ' ' AS PO_REVISIONNUM, ' ' AS PO_VENDOR_NUM, ' ' AS VENDOR, M.ITEMNUM, M.DESCRIPTION, I.ISSUEUNIT AS UOM, TO_CHAR(M.QTYREQUESTED) AS WO_QTY_REQ, TO_CHAR(M.QUANTITY) AS WO_QTY_RECEIVED, TO_CHAR(M.UNITCOST) AS WO_UNITCOST, TO_CHAR(M.LINECOST) AS WO_LINECOST, M.ISSUETYPE, TO_CHAR(M.ACTUALDATE) AS ISSUEDATE, W.STATUS AS WONUM_STATUS, ' ' AS PR_QTY_REQ, ' ' AS PO_QTY_REQ, ' ' AS ACTUAL_QTY_RECEIVED, ' ' AS PO_QTY_RECEIVED, ' ' AS PO_QTY_OPEN, ' ' AS PO_UNITCOST, ' ' AS PO_LINECOST, ' ' AS SHIPTO, ' ' AS DROP_SHIP, ' ' AS ENTERDATE, ' ' AS REQDELIVERYDATE, ' ' AS VENDELIVERYDATE, ' ' AS STATUSDATE, ' ' AS RECEIPTDATE, ' ' AS DIRECT_CHARGE, ' ' AS VENDROR_COMMENT FROM MSCRADS.MATUSETRANS M LEFT OUTER JOIN MXRADS.WORKORDER W ON M.REFWO = W.WONUM LEFT OUTER JOIN MXRADS.ITEM I ON M.ITEMNUM = I.ITEMNUM LEFT OUTER JOIN MXRADS.VW_GLCOMPONENTS G ON SUBSTR(M.GLDEBITACCT,24,8) = G.COMPVALUE WHERE M.ISSUETYPE IN ('ISSUE','RETURN') AND M.SITEID = 'FDN' AND M.PONUM IS NULL AND M.ACTUALDATE >= TO_DATE('2014-05-26','YYYY-MM-DD') AND SUBSTR(M.GLDEBITACCT,24,8) = 'T' ORDER BY 1
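One thing worth checking: SUBSTR(M.GLDEBITACCT,24,8) returns eight characters, so it can never equal the single character 'T'. The sketch below emulates Oracle's 1-based SUBSTR with Python slicing on the sample string from the question; oracle_substr is a hypothetical helper written for this illustration, not part of any library.

```python
def oracle_substr(text, start, length):
    """Emulate Oracle SUBSTR(text, start, length) for a positive, 1-based start."""
    return text[start - 1:start - 1 + length]


account = 'FDN-XXXXXX-XXXXXXX-XXX-XXTXXXXX-0000-0000'

# The 8-character segment the query compares against 'T'.
segment = oracle_substr(account, 24, 8)
print(repr(segment))

# The comparison in the WHERE clause is therefore always false.
print(segment == 'T')

# In this sample the 'T' sits at position 26 of the full string.
print(repr(oracle_substr(account, 26, 1)))
```

If the sample string reflects the real account layout, a filter such as SUBSTR(M.GLDEBITACCT,26,1) = 'T' would be the thing to try in place of the current condition.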
OLEDB Jet 4.0 SQL Character Limit
I am attempting to generate lines for an SDF (Space Delimited File). I am creating these lines from a DBASE IV DBF file using an OLEDB adapter with extended properties DBASEIV to get at the data. My data column output is 425 characters long after padding, I am placing this into a datagridview in VB.NET to display it. However when I run my query, while it seems to execute correctly the resultant field is restricted to 256 characters. The longest individual field I am reading is 35 characters and I am returning a dataset with 2 fields, the barcode and the SDF line. As I understand it OLEDB Jet 4.0 tries to guess the type based on the first 8 rows, however as all rows are equal length for the data column (425 chars) I don't get why it is choosing the smaller field type. I assume it is because my field is a generated one using string concatenation. I have included the horrible SQL at the bottom of this question. So my question is how can I get the full 425 character output? Or is there a way I can specify the datatype for my own field as memo? SELECT scan, RIGHT('0000000000000' + trim(cstr(scan)), 13) + LEFT(trim(cstr(name)) + ' ', 35) + LEFT(trim(cstr(name)) + ' ', 16) + ' ' + ' ' + ' ' + '1 ' + '0.00 ' + '0.00 ' + '1' + '0.00 ' + '0.01 ' + '0.00 ' + 'F' + '2' + '0.00 ' + '0.00 ' + ' ' + ' ' + ' ' + 'SALS' + ' ' + ' ' + LEFT(trim(cstr(plof)) + ' ', 13) + ' ' + ' ' + '0.00 ' + '0.00 ' + '0.00 ' + 'F' + 'T' + '001' + ' ' + 'T' + '01' + ' ' + ' ' + ' ' + ' ' + 'F' + 'F' + ' ' + ' ' + '0 ' + '0.00 ' + ' ' + '0 ' + '0.00 ' + '0.00 ' + '0.00 ' + '0.00 ' + '1 ' + '1 ' + '1 ' + '0.00 ' + '0.00 ' + '0.00 ' + '0.00 ' + '0.00 ' + '0.00 ' + ' ' + ' ' + ' ' as STTEMPLINE from salus where cstr(scan) in (select distinct cstr(scan) from nonscan) Thanks in advance for any help.
This is a known issue where data types are guessed based on the data in the first few rows. See the post "Load Excel sheet into DataTable but only the first 256 chars per cell are imported". Also see this post for an explanation of how OLEDB deals with mixed data types: http://social.msdn.microsoft.com/Forums/pl-PL/csharplanguage/thread/0404d003-5bfb-44f9-8e6b-aebdfce24875
space in a select statement in dynamic query
I have a dynamic query like this:
SET @str_Query = 'SELECT SIM.Item_ID, SIM.Item_Description, SU.Short_Description AS Unit, SIM.Std_Lead_Time,'+ '' ''+' AS Last_Purchase_Rate FROM FKMS_Item_Master AS SIM INNER JOIN FKMS_STP_Units SU ON SIM.Item_Purchase_Unit=SU.Unit_Id' + ' WHERE ' + @str_Condition + ' AND SIM.Location_Id =' + CAST(@aint_Location_Id AS VARCHAR(10)) + ' AND SIM.Item_Deleted =0 AND SIM.Approved_On IS NOT NULL' +' ORDER BY SIM.Item_Description'
I want to return a single space as Last_Purchase_Rate. When I execute this query, it reports a syntax error at the '' ''+' AS Last_Purchase_Rate portion. If I print the dynamic query, it looks correct: it shows AS Last_Purchase_Rate with a space before AS. Please help.
I would write ...SIM.Std_Lead_Time, '' '' AS Last_Purchase_Rate... instead of ...SIM.Std_Lead_Time,'+'' ''+' AS Last_Purchase_Rate...
Why not use NULL instead of a space and then handle the result in your app? I.e.,
SET @str_Query = 'SELECT SIM.Item_ID, SIM.Item_Description, SU.Short_Description AS Unit, SIM.Std_Lead_Time, NULL AS Last_Purchase_Rate, -- and so on.
You could also use CHAR(32):
SET @str_Query = 'SELECT SIM.Item_ID, SIM.Item_Description, SU.Short_Description AS Unit, SIM.Std_Lead_Time, CHAR(32) AS Last_Purchase_Rate, -- and so on.
You did not escape all quotes. A working version of your statement would be
SET @str_Query = 'SELECT SIM.Item_ID, SIM.Item_Description, SU.Short_Description AS Unit, SIM.Std_Lead_Time,' + ''' ''' + ' AS Last_Purchase_Rate FROM FKMS_Item_Master AS SIM INNER JOIN FKMS_STP_Units SU ON SIM.Item_Purchase_Unit=SU.Unit_Id' + ' WHERE ' + @str_Condition + ' AND SIM.Location_Id =' + CAST(@aint_Location_Id AS VARCHAR(10)) + ' AND SIM.Item_Deleted =0 AND SIM.Approved_On IS NOT NULL' +' ORDER BY SIM.Item_Description'
but I find that with a little reformatting, the error is easier to spot:
SET @str_Query = 'SELECT SIM.Item_ID '
+ ', SIM.Item_Description '
+ ', SU.Short_Description AS Unit '
+ ', SIM.Std_Lead_Time '
+ ', '' ''' + ' AS Last_Purchase_Rate '
+ 'FROM FKMS_Item_Master AS SIM '
+ ' INNER JOIN FKMS_STP_Units SU '
+ ' ON SIM.Item_Purchase_Unit=SU.Unit_Id '
+ ' WHERE ' + @str_Condition
+ ' AND SIM.Location_Id = ' + CAST(@aint_Location_Id AS VARCHAR(10))
+ ' AND SIM.Item_Deleted =0 '
+ ' AND SIM.Approved_On IS NOT NULL '
+ ' ORDER BY SIM.Item_Description '
Try using the T-SQL function SPACE(1).