Kotlin: using two readln() calls, but only the first one gets the input - kotlin

I am using Kotlin to read two lines of input:
val number1 = readln()
val number2 = readln()
println(" number 1 is $number1 number 2 is $number2")
The result is:
1
1
number 1 is 1 number 2 is


Finding the contiguous sequences of equal elements in a list Raku

I'd like to find the contiguous sequences of equal elements (e.g. of length 2) in a list
my @s = <1 1 0 2 0 2 1 2 2 2 4 4 3 3>;
say grep {$^a eq $^b}, @s;
# ==> ((1 1) (2 2) (4 4) (3 3))
This code looks OK, but when one more 2 is added after the sequence 2 2 2, or when one 2 is removed from it, it says: Too few positionals passed; expected 2 arguments but got 1. How can I fix it? Please note that I'm trying to find them without using a for loop, i.e. I'm trying to use functional code as much as possible.
Optional: In the bold printed section:
<1 1 0 2 0 2 1 2 2 2 4 4 3 3>
multiple sequences of 2 2 are seen. How can I print them as many times as they are seen? Like:
((1 1) (2 2) (2 2) (4 4) (3 3))
There are an even number of elements in your input:
say elems <1 1 0 2 0 2 1 2 2 2 4 4 3 3>; # 14
Your grep block consumes two elements each time:
{$^a eq $^b}
So if you add or remove an element you'll get the error you're getting when the block is run on the single element left over at the end.
There are many ways to solve your problem.
But you also asked about the option of allowing for overlapping so, for example, you get two (2 2) sub-lists when the sequence 2 2 2 is encountered. And, in a similar vein, you presumably want to see two matches, not zero, with input like:
<1 2 2 3 3 4>
So I'll focus on solutions that deal with those issues too.
Despite the narrowing of solution space to deal with the extra issues, there are still many ways to express solutions functionally.
One way that just appends a bit more code to the end of yours:
my @s = <1 1 0 2 0 2 1 2 2 2 4 4 3 3>;
say grep {$^a eq $^b}, @s .rotor( 2 => -1 ) .flat
The .rotor method converts a list into a list of sub-lists, each of the same length. For example, say <1 2 3 4> .rotor: 2 displays ((1 2) (3 4)). If the length argument is a pair, then the key is the length and the value is an offset for starting the next pair. If the offset is negative you get sub-list overlap. Thus say <1 2 3 4> .rotor: 2 => -1 displays ((1 2) (2 3) (3 4)).
The .flat method "flattens" its invocant. For example, say ((1,2),(2,3),(3,4)) .flat displays (1 2 2 3 3 4).
A perhaps more readable way to write the above solution would be to omit the flat and use .[0] and .[1] to index into the sub-lists returned by rotor:
say @s .rotor( 2 => -1 ) .grep: { .[0] eq .[1] }
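For readers more at home outside Raku, the same overlapping-window idea can be sketched in Python. This is only an illustrative analogue, not a translation of the Raku semantics: the helper names overlapping_pairs and equal_neighbours are made up for this sketch, and zip over the list and its one-step shift stands in for .rotor( 2 => -1 ).

```python
def overlapping_pairs(items):
    """Return every adjacent pair, e.g. [1, 2, 3] -> [(1, 2), (2, 3)]."""
    # zip stops at the shorter input, so no element is ever left without a partner.
    return list(zip(items, items[1:]))


def equal_neighbours(items):
    """Keep only the adjacent pairs whose two elements are equal."""
    return [pair for pair in overlapping_pairs(items) if pair[0] == pair[1]]


s = "1 1 0 2 0 2 1 2 2 2 4 4 3 3".split()
print(equal_neighbours(s))
```

Because each window overlaps the previous one by one element, a run such as 2 2 2 contributes two pairs, and a list with an odd number of elements raises no error.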
See also Elizabeth Mattijsen's comment for another variation that generalizes for any sub-list size.
If you needed a more general coding pattern you might write something like:
say @s .pairs .map: { .value xx 2 if .key < @s - 1 and [eq] @s[.key,.key+1] }
The .pairs method on a list returns a list of pairs, each pair corresponding to each of the elements in its invocant list. The .key of each pair is the index of the element in the invocant list; the .value is the value of the element.
.value xx 2 could have been written .value, .value. (See xx.)
@s - 1 is the number of elements in @s minus 1.
The [eq] in [eq] list is a reduction.
If you need text pattern matching to decide what constitutes contiguous equal elements, you might convert the input list into a string, match against that using one of the match adverbs that generate a list of matches, then map from the resulting list of matches to your desired result. To match with overlaps (e.g. 2 2 2 results in ((2 2) (2 2))), use :ov:
say @s .Str .match( / (.) ' ' $0 /, :ov ) .map: { .[0].Str xx 2 }
TIMTOWDI!
Here's an iterative approach using gather/take.
say gather for <1 1 0 2 0 2 1 2 2 2 4 4 3 3> {
state $last = '';
take ($last, $_) if $last == $_;
$last = $_;
};
# ((1 1) (2 2) (2 2) (4 4) (3 3))

Changing names of variables using the values of another variable

I am trying to rename around 100 dummy variables with the values from a separate variable.
I have a variable products, which stores information on what products a company sells and have generated a dummy variable for each product using:
tab products, gen(productid)
However, the variables are named productid1, productid2 and so on. I would like these variables to take the values of the variable products instead.
Is there a way to do this in Stata without renaming each variable individually?
Edit:
Here is an example of the data that will be used. There will be duplications in the product column.
And then I have run the tab command to create a dummy variable for each product to produce the following table.
sort product
tab product, gen(productid)
I noticed it updates the labels to show what each variable represents.
What I would like to do is to assign the value to be the name of the variable such as commercial to replace productid1 and so on.
Using your example data:
clear
input companyid str10 product
1 "P2P"
2 "Retail"
3 "Commercial"
4 "CreditCard"
5 "CreditCard"
6 "EMFunds"
end
tabulate product, generate(productid)
list, abbreviate(10)
sort product
levelsof product, local(new) clean
tokenize `new'
ds productid*
local i 0
foreach var of varlist `r(varlist)' {
local ++i
rename `var' ``i''
}
Produces the desired output:
list, abbreviate(10)
+---------------------------------------------------------------------------+
| companyid product Commercial CreditCard EMFunds P2P Retail |
|---------------------------------------------------------------------------|
1. | 3 Commercial 1 0 0 0 0 |
2. | 5 CreditCard 0 1 0 0 0 |
3. | 4 CreditCard 0 1 0 0 0 |
4. | 6 EMFunds 0 0 1 0 0 |
5. | 1 P2P 0 0 0 1 0 |
6. | 2 Retail 0 0 0 0 1 |
+---------------------------------------------------------------------------+
Arbitrary strings might not be legal Stata variable names. This will happen if they (a) are too long; (b) start with any character other than a letter or an underscore; (c) contain characters other than letters, numeric digits and underscores; or (d) are identical to existing variable names. You might be better off making the strings into variable labels, where only an 80 character limit bites.
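The four failure cases (a) to (d) can be made concrete with a small check. The sketch below is in Python rather than Stata and is only an illustration of the rules as listed above: name_problems is a hypothetical helper, the 32-character limit is an assumption about Stata's maximum name length, and reserved words are not covered.

```python
import re

MAX_NAME_LENGTH = 32  # assumption: Stata's limit on variable-name length


def name_problems(candidate, existing_names):
    """Return the reasons why candidate could not be used as a variable name."""
    problems = []
    if len(candidate) > MAX_NAME_LENGTH:
        problems.append("too long")  # case (a)
    if not re.match(r"[A-Za-z_]", candidate):
        problems.append("bad first character")  # case (b)
    if re.search(r"[^A-Za-z0-9_]", candidate):
        problems.append("illegal character")  # case (c)
    if candidate in existing_names:
        problems.append("already in use")  # case (d)
    return problems


existing = {"companyid", "product"}
for value in ["Commercial", "six something", "2fast", "product"]:
    print(value, name_problems(value, existing))
```

A value with an empty problem list is safe to pass to rename; anything else is better kept as a variable label.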
This code loops over the variables and does its best:
gen long obs = _n
foreach v of var productid? productid?? productid??? {
su obs if `v' == 1, meanonly
local tryit = product[r(min)]
capture rename `v' `=strtoname("`tryit'")'
}
Note: code not tested.
EDIT: Here is a test. I added code for variable labels. The data example and code show that repeated values and values that could not be variable names are accommodated.
clear
input str13 products
"one"
"two"
"one"
"three"
"four"
"five"
"six something"
end
tab products, gen(productsid)
gen long obs = _n
foreach v of var productsid*{
su obs if `v' == 1, meanonly
local value = products[r(min)]
local tryit = strtoname("`value'")
capture rename `v' `tryit'
if _rc == 0 capture label var `tryit' "`value'"
else label var `v' "`value'"
}
drop obs
describe
Contains data
obs: 7
vars: 7
size: 133
-------------------------------------------------------------------------------
storage display value
variable name type format label variable label
-------------------------------------------------------------------------------
products str13 %13s
five byte %8.0g five
four byte %8.0g four
one byte %8.0g one
six_something byte %8.0g six something
three byte %8.0g three
two byte %8.0g two
-------------------------------------------------------------------------------
Another solution is to use the extended macro function
local varlabel:variable label
The tested code is:
clear
input companyid str10 product
1 "P2P"
2 "Retail"
3 "Commercial"
4 "CreditCard"
5 "CreditCard"
6 "EMFunds"
end
tab product, gen(product_id)
* get the list of product id variables
ds product_id*
* loop through the product id variables and change the
* variable name to its label
foreach var of varlist `r(varlist)' {
local varlabel: variable label `var'
display "`varlabel'"
local pos = strpos("`varlabel'","==")+2
local varlabel = substr("`varlabel'",`pos',.)
display "`varlabel'"
rename `var' `varlabel'
}

Add 'document_id' column to pandas dataframe of word-id's and wordcounts

I have the following dataset:
import pandas as pd
jsonDF = pd.DataFrame({'DOCUMENT_ID':[263403828328665088,264142543883739136], 'MESSAGE':['#Zuora wants to help #Network4Good with Hurric...','#ztrip please help spread the good word on hel...']})
DOCUMENT_ID MESSAGE
0 263403828328665088 #Zuora wants to help #Network4Good with Hurric...
1 264142543883739136 #ztrip please help spread the good word on hel...
I am trying to reshape my data in the form of
docID wordID count
0 1 118 1
1 1 285 1
2 1 1229 1
3 1 1688 1
4 1 2068 1
I used the following:
r = []
for i in jsonDF['MESSAGE']:
    for j in sortedValues(wordsplit(i)):
        r.append(j)
IDCount_Re = pd.DataFrame(r)
IDCount_Re[:5]
which gives me the following result:
0 17
1 help 2
2 wants 1
3 hurricane 1
4 relief 1
5 text 1
6 sandy 1
7 donate 1
8 6
9 please 1
I can get the word counts, but I have no idea how to append DOCUMENT_ID to the above dataframe.
The following functions were used to split words:
from collections import Counter, OrderedDict
import itertools
import re
from nltk.corpus import stopwords
def wordsplit(wordlist):
    # strip digits, retweet markers, links, hashtags and punctuation
    j = wordlist
    j = re.sub(r'\d+', '', j)
    j = re.sub('RT', '', j)
    j = re.sub('http', '', j)
    j = re.sub(r"(#[A-Za-z0-9]+)|([^0-9A-Za-z \t])|(\w+:\/\/\S+)", " ", j)
    j = j.lower()
    j = j.strip()
    if j not in stopwords.words('english'):
        yield j
def wordSplitCount(wordlist):
    '''merges a list into string, splits it, removes stop words and
    then counts the occurrences returning an ordered dictionary'''
    #stopwords=set(stopwords.words('english'))
    string1 = ''.join(list(itertools.chain(filter(None, wordlist))))
    cnt = Counter()
    j = []
    for i in string1.split(" "):
        i = re.sub(r'&', ' ', i.lower())
        if i not in stopwords.words('english'):
            cnt[i] += 1
    return OrderedDict(cnt)
def sortedValues(wordlist):
    '''creates a list of (word, count) pairs, with counts descending'''
    d = wordSplitCount(wordlist)
    return sorted(d.items(), key=lambda t: t[1], reverse=True)
UPDATE: SOLUTION HERE:
string split and and assign unique ids to Pandas DataFrame
'DOCUMENT_ID' is one of the two fields in each row of jsonDF. Your current code doesn't access it because it directly works on jsonDF['MESSAGE'].
Here is some non-working pseudocode - something like:
for _, row in jsonDF.iterrows():
    doc_id, msg = row
    words = [word for word in wordsplit(msg)][0].split() # hack
    wordcounts = Counter(words).most_common() # sort by decr frequency
Then do a pd.concat(pd.DataFrame({'DOCUMENT_ID': doc_id, ...
and get the 'wordId' and 'count' fields from wordcounts.
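One possible way to flesh out that idea is sketched below, using only the standard library so it runs without pandas or NLTK. The function name doc_word_counts and the sample messages are hypothetical, and plain whitespace splitting stands in for the question's wordsplit() and stop-word filtering.

```python
from collections import Counter


def doc_word_counts(docs):
    """Return (doc_id, word, count) rows, most frequent words first per document."""
    rows = []
    for doc_id, message in docs:
        # Assumption: simple lowercase + whitespace tokenization instead of wordsplit().
        counts = Counter(message.lower().split())
        for word, count in counts.most_common():
            rows.append((doc_id, word, count))
    return rows


# Hypothetical sample data, not the messages from the question.
sample_docs = [
    (101, "please help help spread"),
    (102, "good word good"),
]
rows = doc_word_counts(sample_docs)
for row in rows:
    print(row)
```

With the real data, the (id, message) pairs could come from zip(jsonDF['DOCUMENT_ID'], jsonDF['MESSAGE']), and pd.DataFrame(rows, columns=['DOCUMENT_ID', 'word', 'count']) would turn the rows into a long-format frame. Mapping each word to an integer wordID would be a further step.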

I don't understand how to use Modulus [duplicate]

I am brand new to coding and I am having a lot of trouble just getting this to work:
Declare 2 integer variables. Prompt the user for two numbers.
Store the values into variables using the Scanner object.
If the second number is a multiple of the first number, display " is a multiple of ". Otherwise display " is not a multiple of "
I have the code below, and then I am lost:
int number1, number2;
System.out.println("Enter a number:");
number1 = keyboard.nextInt();
System.out.println("Enter a Number:");
number2 = keyboard.nextInt();
Modulus is the remainder left after dividing one number by another. If number2 is a multiple of number1, dividing number2 by number1 leaves a remainder of 0; otherwise the remainder is non-zero.
package test;
import java.util.Scanner;
public class MyTest {
    public static void main(String[] args) {
        Scanner keyboard = new Scanner(System.in);
        int number1, number2;
        System.out.println("Enter a number:");
        number1 = keyboard.nextInt();
        System.out.println("Enter a Number:");
        number2 = keyboard.nextInt();
        int mod = number2 % number1;
        if (mod == 0) {
            System.out.println(number2 + " is a multiple of " + number1);
        } else {
            System.out.println(number2 + " is not a multiple of " + number1);
        }
    }
}
Modulus simply gives you the remainder of the division of two numbers. For example, dividing 4 by 3 leaves a remainder of 1; in code, 4 % 3 gives the same result. A Python snippet is added below:
a = int(input("Enter first no."))
b = int(input("Enter second no."))
print("Modulus operation a%b gives : ", (a % b))
print("Is ", a, " multiple of ", b)
print((a%b) == 0)
If number1 is a multiple of number2, then dividing number1 by number2 leaves a remainder of 0.
For example, 4 is a multiple of 2, so 4 % 2 == 0 is true.
Here % is the modulus operator, which returns the remainder of the division of two numbers.
In your case, if number1 is a multiple of number2, then
number1 % number2 == 0 must hold.
So your program should look like this:
if (number1 % number2 == 0)
{
    System.out.println("number1 is a multiple of number 2");
}
else
    System.out.println("not a multiple");
i am lost
Suppose you are in a competition to eat chocolates and the rule is:
Eat three chocolates at the same time
If you are given:
1 chocolate, you will not eat, and 1 chocolate will be left
2 chocolates, you will not eat, and 2 chocolates will be left
3 chocolates, you eat once, and 0 chocolates will be left
4 chocolates, you eat once, and 1 chocolate will be left
5 chocolates, you eat once, and 2 chocolates will be left
6 chocolates, you eat twice, and 0 chocolates will be left
7 chocolates, you eat twice, and 1 chocolate will be left
...
100 chocolates, you eat 33 times, 1 chocolate will be left
...
1502 chocolates, you eat 500 times, 2 chocolates will be left
The number of chocolates left each time is called the modulus in mathematics, and it is a powerful operation in computing.
As you can tell, you will always be left with fewer than 3 chocolates (0, 1 or 2).
Hence, for non-negative x and positive y, x modulus y is always less than y, in the range 0 to y-1.
In programming, the modulus operator is %.
So, if we write the way you ate chocolates using the modulus operator %, it looks like:
1 % 3 = 1
2 % 3 = 2
3 % 3 = 0
4 % 3 = 1
5 % 3 = 2
6 % 3 = 0
7 % 3 = 1
...
100 % 3 = 1
...
1502 % 3 = 2
Now tell me, are you lost?
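The chocolate table and the multiple check can be reproduced with a few lines of Python. This is a minimal sketch: is_multiple is a name chosen here for illustration, and the zero-divisor branch is an added assumption about how one might want to treat that case.

```python
def is_multiple(candidate, base):
    """Return True when candidate is a whole-number multiple of base."""
    if base == 0:
        # The remainder is undefined for a zero divisor; only 0 is a multiple of 0.
        return candidate == 0
    return candidate % base == 0


for chocolates in (1, 3, 7, 100, 1502):
    # divmod returns the quotient (times eaten) and the remainder (left over).
    times_eaten, left_over = divmod(chocolates, 3)
    print(chocolates, "chocolates: eat", times_eaten, "times,", left_over, "left")

print(is_multiple(6, 3), is_multiple(7, 3))
```

The remainder printed on each line is exactly what chocolates % 3 returns, and is_multiple simply asks whether that remainder is 0.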

How to filter after group by and aggregate in Spark dataframe?

I have a spark dataframe df with schema as such:
[id:string, label:string, tag:string]
id | label | tag
---|-------|-----
1 | h | null
1 | w | x
1 | v | null
1 | v | x
2 | h | x
3 | h | x
3 | w | x
3 | v | null
3 | v | null
4 | h | null
4 | w | x
5 | w | x
(h,w,v are labels. x can be any non-empty values)
For each id, there is at most one label "h" or "w", but there might be multiple "v". I would like to select all the ids that satisfy the following conditions:
Each id has:
1. one label "h" and its tag = null,
2. one label "w" and its tag != null,
3. at least one label "v" for each id.
I am thinking that I need to create three columns checking each above conditions. And then I need to do a group by "id".
val hCheck = (label: String, tag: String) => {if (label=="h" && tag==null) 1 else 0}
val udfHCheck = udf(hCheck)
val wCheck = (label: String, tag: String) => {if (label=="w" && tag!=null) 1 else 0}
val udfWCheck = udf(wCheck)
val vCheck = (label: String) => {if (label==null) 1 else 0}
val udfVCheck = udf(vCheck)
dfx = df.withColumn("hCheck", udfHCheck(col("label"), col("tag")))
.withColumn("wCheck", udfWCheck(col("label"), col("tag")))
.withColumn("vCheck", udfVCheck(col("label")))
.select("id","hCheck","wCheck","vCheck")
.groupBy("id")
Somehow I need to group the three columns {"hCheck","wCheck","vCheck"} into a list of vectors [x,0,0],[0,x,0],[0,0,x], and check whether these vectors contain all three of {[1,0,0],[0,1,0],[0,0,1]}.
I have not been able to solve this problem yet. And there might be a better approach than this one. Hope someone can give me suggestions. Thanks
To convert the three checks to vectors, you can do:
val df1 = df.withColumn("hCheck", udfHCheck(col("label"), col("tag")))
.withColumn("wCheck", udfWCheck(col("label"), col("tag")))
.withColumn("vCheck", udfVCheck(col("label")))
.select($"id",array($"hCheck",$"wCheck",$"vCheck").as("vec"))
Next the groupby returns a grouped object on which you need to perform aggregations. Specifically to get all the vectors you should do something like:
.groupBy("id").agg(collect_list($"vec"))
Also you do not need udfs for the various checks. You can do it with column semantics. For example udfHCheck can be written as:
when($"label" === "h" && $"tag".isNull, 1).otherwise(0)
BTW, you said you wanted a label 'v' for each but in vcheck you just check if the label is null.
Update: Alternative solution
Looking at this question again, I would do something like this:
val grouped = df.groupBy("id", "label").agg(count($"label").as("cnt"), first($"tag").as("tag"))
val filtered1 = grouped.filter($"label" === "v" || $"cnt" === 1)
val filtered2 = filtered1.filter($"label" === "v" || ($"label" === "h" && $"tag".isNull) || ($"label" === "w" && $"tag".isNotNull))
val ids = filtered2.groupBy("id").count.filter($"count" === 3)
The idea is that first we groupby BOTH id and label so we have information on the combination. The information we collect is how many values (cnt) and the first element (doesn't matter which).
Now we do two filtering steps:
1. we need exactly one h and one w and any number of v so the first filter gets us these cases.
2. we make sure all the rules are met for each of the cases.
Now we have only combinations of id and label which match the rules so in order for the id to be legal we need to have exactly three instances of label. This leads to the second groupby which simply counts the number of labels which matched the rules. We need exactly three to be legal (i.e. matched all the rules).
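To check the reasoning against the sample table from the question, the same three steps can be emulated without Spark. The sketch below is plain Python over tuples, not the Spark API; qualifying_ids is a hypothetical helper and None stands in for null.

```python
from collections import defaultdict

# The sample table from the question as (id, label, tag) rows.
rows = [
    (1, "h", None), (1, "w", "x"), (1, "v", None), (1, "v", "x"),
    (2, "h", "x"),
    (3, "h", "x"), (3, "w", "x"), (3, "v", None), (3, "v", None),
    (4, "h", None), (4, "w", "x"),
    (5, "w", "x"),
]


def qualifying_ids(rows):
    # Step 1: group by (id, label), keeping the row count and the first tag seen.
    groups = {}
    for row_id, label, tag in rows:
        count, first_tag = groups.get((row_id, label), (0, tag))
        groups[(row_id, label)] = (count + 1, first_tag)

    # Step 2: keep only the combinations that satisfy the rule for their label.
    matched = defaultdict(int)
    for (row_id, label), (count, first_tag) in groups.items():
        ok = (
            label == "v"
            or (label == "h" and count == 1 and first_tag is None)
            or (label == "w" and count == 1 and first_tag is not None)
        )
        if ok:
            matched[row_id] += 1

    # Step 3: an id is legal only when all three labels matched.
    return sorted(row_id for row_id, n in matched.items() if n == 3)


print(qualifying_ids(rows))
```

On this data only id 1 has a null-tagged "h", a tagged "w" and at least one "v", so it is the only id that reaches a count of three.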