My solution to the Stock Maximize problem gives the correct answer locally, but I am getting a runtime error on HackerRank. I am new to Python, so I am having difficulty removing the error. The input is of this form:
1      // number of test cases
3      // number of stocks
5 3 2  // cost of stocks
I think the error is in reading the input as 5 3 2 (all on one line). If I take it as
5
3
2
(one number per line), then it works fine. How can I fix this problem?
t = int(input())
list = []
while t > 0:
    n = int(input())
    list.clear()
    for i in range(0, n):
        list.append(int(input()))
    sum = 0
    print('hello')
    max = list[n-1]
    for i in range(n-2, -1, -1):
        if list[i] < max:
            sum = sum + (max - list[i])
        else:
            max = list[i]
    print(sum)
    t = t - 1
You can read the cost-of-stocks line like this in Python 3:
stockcost=[int(k) for k in input().split()]
This will create a list of stock prices
Instead of trying to read three lines of stock costs when there is actually only one line of three space-separated costs, you need to read that one line and split it into a list of integers, for example like this (since it looks like you're using Python 3):
stocks = list(map(int, input().split(" ")))
With your example, this would give you the list [5, 3, 2].
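Putting that together with the loop from the question, a minimal sketch of the corrected program could look like this (the builtin-shadowing names list, sum and max are renamed here; not verified against HackerRank's judge):

t = int(input())                               # number of test cases
for _ in range(t):
    n = int(input())                           # number of stocks
    prices = list(map(int, input().split()))   # all n costs on one line

    total = 0
    best = prices[-1]                          # highest price seen so far, scanning from the right
    for i in range(n - 2, -1, -1):
        if prices[i] < best:
            total += best - prices[i]          # buy here, sell at the future maximum
        else:
            best = prices[i]
    print(total)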
I am trying to visualize changes in gene expression as categorical variables (up, down, no change) over various timepoints.
I have a dataframe describing differential expression data that looks like this:
import pandas as pd

data = {'gene': ['Svm3G0018840', 'Svm5G0011050', 'Svm9G0059770'],
        '01h': ['nc', 'up', 'down'], '04h': ['up', 'down', 'nc'], '08h': ['nc', 'down', 'up']}
df = pd.DataFrame.from_dict(data)
df = df.set_index('gene')
I can use a dataframe like this (herbdf, my full dataframe, which also has 24h and 48h columns) to create the parallel plot with the following code:
import plotly.express as px

fig = px.parallel_categories(herbdf, dimensions=['01h', '04h', '08h', '24h', '48h'],
                             labels={'01h': '', '04h': '', '08h': '', '24h': '', '48h': ''})
fig.show()
However, the categories (up, down, nc) are not always in the same order for every time point, which makes the figure very difficult to read. I can change this in the interactive figure in a notebook, but then I can only output the corrected figure as a low-quality PNG. I need the image in SVG format, which means I need to use the line:
fig.write_image("/figs/herb_de_pp.svg")
But when I add this line to the code block to save the figure, I have no control over the order the categorical boxes end up in.
I have tried adding fig.update_* calls to solve this problem, such as:
fig.update_layout(xaxis={'categoryorder':'total descending'})
but this doesn't seem to change the output at all.
I could be missing something simple; any help would be much appreciated!
Parallel categories diagrams don't have xaxis/yaxis properties; you need to update the traces in order to change the category order within each dimension:
dimensions = ['01h', '04h', '08h','24h','48h']
...
fig.update_traces(dimensions=[{"categoryorder": "category descending"} for _ in dimensions])
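Applied to the small sample df above (which only has the three timepoint columns), the whole thing might look roughly like this; write_image assumes the kaleido package is installed:

import plotly.express as px

dimensions = ['01h', '04h', '08h']
fig = px.parallel_categories(df, dimensions=dimensions,
                             labels={'01h': '', '04h': '', '08h': ''})
# force the same category order in every dimension
fig.update_traces(dimensions=[{"categoryorder": "category descending"} for _ in dimensions])
fig.write_image("herb_de_pp.svg")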
Not a great answer here, but something that I think will work in a pinch...
It looks like the order of the categories of each figure/column come from the order that they are in the original dataset. That is, in your first column, nc is the first unique item, then down is the second unique item, up is third.
So, if you can rearrange/sort your data so that the data shows up in the order you want it displayed, that should work.
Have your first row be nc | nc | nc | nc | nc, second row down | down | down | down | down, and third row up | up | up | up | up (assuming you actually have records like that). That should do it, but isn't very elegant...
Given the above solution, this is the line needed to sort the dataframe and produce the figure with ordered categories:
sorteddf = df.sort_values(by=['01h','04h','08h'], axis=0, ascending=False)
I'm following a tutorial on NLP but have encountered a KeyError when trying to group my raw data into good and bad reviews. Here is the tutorial link: https://towardsdatascience.com/detecting-bad-customer-reviews-with-nlp-d8b36134dc7e
#reviews.csv
I am so angry about the service
Nothing was wrong, all good
The bedroom was dirty
The food was great
#nlp.py
import pandas as pd
#read data
reviews_df = pd.read_csv("reviews.csv")
# append the positive and negative text reviews
reviews_df["review"] = reviews_df["Negative_Review"] +
reviews_df["Positive_Review"]
reviews_df.columns
I'm seeing the following error:
File "pandas\_libs\hashtable_class_helper.pxi", line 1500, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'Negative_Review'
Why is this happening?
You're getting this error because your CSV isn't structured the way your code expects.
When you do reviews_df["review"] = reviews_df["Negative_Review"] + reviews_df["Positive_Review"] you're actually adding the values of the Negative_Review column (which does not currently exist) to the Positive_Review column, and storing the result in a 'review' column (which also does not exist).
Your CSV is nothing more than a plain-text file with one review per row. Also, since you're working with text, remember to enclose every string in quotation marks ("), otherwise your commas will create fake columns.
With your approach, it seems that you'll still be tagging all your reviews manually (usually, if you're working with machine learning, you do this outside the code and load the labelled data into your script).
In order for your code to work, you want to do the following:
import pandas as pd
df = pd.read_csv('TestFileFolder/57886076.csv', names=['text'])
## Fill with placeholder values
df['Positive_review']=0
df['Negative_review']=1
df.head()
Result:
text Positive_review Negative_review
0 I am so angry about the service 0 1
1 Nothing was wrong, all good 0 1
2 The bedroom was dirty 0 1
3 The food was great 0 1
However, I would recommend having a single column (e.g. is_review_positive) set to True or False. You can easily encode it later on.
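A minimal sketch of that suggestion (the column name is_review_positive comes from the recommendation above, and the labels are placeholders you would normally assign by hand):

import pandas as pd

df = pd.read_csv('reviews.csv', names=['text'])         # the file has no header row
df['is_review_positive'] = [False, True, False, True]   # placeholder manual labels
df['label'] = df['is_review_positive'].astype(int)      # encode as 0/1 later on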
I have a DataFrame and I want to display the frequencies for certain values in a certain Series using pd.Series.value_counts().
The problem is that I only see truncated results in the output. I'm coding in Jupyter Notebook.
I have tried unsuccessfully a couple of methods:
df = pd.DataFrame(...) # assume df is a DataFrame with many columns and rows
# 1st method
df.col1.value_counts()
# 2nd method
print(df.col1.value_counts())
# 3rd method
vals = df.col1.value_counts()
vals  # print(vals) doesn't work either
# All output something like this
value1 100000
value2 10000
...
value1000 1
Currently this is what I'm using, but it's quite cumbersome:
print(df.col1.value_counts()[:50])
print(df.col1.value_counts()[50:100])
print(df.col1.value_counts()[100:150])
# etc.
Also, I have read this related Stack Overflow question, but haven't found it helpful.
So how do I stop the output from being truncated?
If you want to print all rows from now on:
pd.options.display.max_rows = 1000
print(vals)
If you want to print all rows just this once:
with pd.option_context("display.max_rows", 1000):
    print(vals)
Relevant documentation here.
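If you don't want to guess an upper bound, the option also accepts None, which removes the row limit entirely:

with pd.option_context("display.max_rows", None):
    print(vals)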
I think you need option_context with display.max_rows set to some large number, e.g. 999. The advantage of this solution is:
option_context context manager has been exposed through the top-level API, allowing you to execute code with given option values. Option values are restored automatically when you exit the with block.
# temporarily display 999 rows
with pd.option_context('display.max_rows', 999):
    print(df.col1.value_counts())
I know other people have asked similar questions in past but I am still stuck on how to solve the problem and was hoping someone could offer some help. Using PsychoPy, I would like to present different images, specifically 16 emotional trials, 16 neutral trials and 16 face trials. I would like to pseudo randomize the loop such that there would not be more than 2 consecutive emotional trials. I created the experiment in Builder but compiled a script after reading through previous posts on pseudo randomization.
I have read the previous posts that suggest creating randomized excel files and using those, but considering how many trials I have, I think that would be too many and was hoping for some help with coding. I have tried to implement and tweak some of the code that has been posted for my experiment, but to no avail.
Does anyone have any advice for my situation?
Thank you,
Rae
Here's an approach that will always converge very quickly, given that you have 16 of each type and only reject runs of more than two emotion trials. @brittUWaterloo's suggestion to generate trials offline is very good; this is what I do myself typically. (I like to have a small number of random orders, do them forward for some subjects and backwards for others, and prescreen them to make sure there are no weird or unintended juxtapositions.) But the algorithm below is certainly safe enough to do within an experiment if you prefer.
This first example assumes that you can represent a given trial using a string, such as 'e' for an emotion trial, 'n' for neutral, 'f' for face. This would work with 'emo', 'neut', 'face' as well, not just single letters; just change 'eee' to 'emoemoemo' in the code:
import random

trials = ['e'] * 16 + ['n'] * 16 + ['f'] * 16
while 'eee' in ''.join(trials):
    random.shuffle(trials)
print(trials)
Here's a more general way of doing it, where the trial codes are not restricted to be strings (although they are strings here for illustration):
import random

def run_of_3(trials, obj):
    # detect if there's a run of at least 3 objects 'obj'
    for i in range(2, len(trials)):
        if trials[i-2: i+1] == [obj] * 3:
            return True
    return False

tr = ['e'] * 16 + ['n'] * 16 + ['f'] * 16
while run_of_3(tr, 'e'):
    random.shuffle(tr)
print(tr)
Edit: To create a PsychoPy-style conditions file from the trial list, just write the values into a file like this:
with open('emo_neu_face.csv', 'w') as f:
    f.write('stim\n')        # this is a 'header' row
    f.write('\n'.join(tr))   # these are the values
Then you can use that as a conditions file in a Builder loop in the regular way. You could also open this in Excel, and so on.
This is not quite right, but hopefully it will give you some ideas. I think you could occasionally get caught in an infinite cycle in the elif statement if the last three items ended up the same, but you could add some sort of counter there. In any case this shows a strategy you could adapt. Rather than put this in the experimental code, I would generate the trial sequence separately at the command line, then save a successful output as a list in the experimental code to show to all participants, so you know things won't crash during an actual run.
import random as r

# making some dummy data
abc = ['f'] * 10 + ['e'] * 10 + ['d'] * 10

def f(l1, l2):
    # just looking at the output to see how it works; can delete
    print("l1 = " + str(l1))
    print(l2)
    if not l2:
        # checks if second list is empty; if so, we are done
        out = list(l1)
    elif l1[-1] == l1[-2] and l1[-1] == l2[0]:
        # shuffling changes the list in place, have to copy it to use it
        r.shuffle(l2)
        t = list(l2)
        f(l1, t)
    else:
        print("i am here")
        l1.append(l2.pop(0))
        f(l1, l2)
    return l1
You would then run it with something like newlist = f(abc[0:2],abc[2:-1])
I am new to integration and MuleSoft so I need your help. I have a flat file with different invoice line items per salesID, like this:
SalesOrderID OrderQty UnitPrice
43659 70 2024.994
43659 70 2024.994
43660 1 419.4589
43660 1 874.794
43661 1 809.76
I want to insert total invoice amount and quantity in another CSV file using Mule, something like this:
SalesOrderID OrderQty UnitPrice
43659 140 4049.988
43660 2 1294.2529
43661 1 809.76
I know how to do this in Informatica, but I'm trying to figure out a way to do this in MuleSoft. How can I sum up all the line items and group them by SalesOrderID? Any help/clue will be really appreciated.
Thanks.
There are a number of ways to do this; let me explain my personal favourite, assuming you are using Mule Community Edition.
Use the superb SuperCSV library: it will allow you not only to parse the data (including those decimal numbers) but also to validate it and give back an exact report on why the file is broken, in case it is.
This could be done in a transformer that transforms the inbound stream and returns either a map of maps (or, even better, an iterator that handles the whole thing, but this is more difficult) or a custom exception with the error report.
Given that this requirement is one that faces Mule developers even today, it's useful to see a solution based on the Mule 4.x runtime and Dataweave 2.x.
If the data came from a file or is otherwise a monolithic stream of text, use the splitBy() function to get an array of text lines.
payload splitBy '\n'
Remove the first line, since the header row should not be included in the calculation:
payload[1 to -1] // this is my favorite way to do it
Now use the reduce() function to iterate over each of the lines in turn, updating the accumulator each time to account for quantity and price.
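To make that reduce step concrete, here is the same accumulate-by-SalesOrderID idea sketched in plain Python rather than DataWeave (purely illustrative; the space-separated layout is assumed from the sample file in the question):

# 'raw' stands in for the flat-file payload from the question
raw = """SalesOrderID OrderQty UnitPrice
43659 70 2024.994
43659 70 2024.994
43660 1 419.4589
43660 1 874.794
43661 1 809.76"""

totals = {}                         # SalesOrderID -> [total qty, total price]
for line in raw.splitlines()[1:]:   # skip the header row
    order_id, qty, price = line.split()
    acc = totals.setdefault(order_id, [0, 0.0])
    acc[0] += int(qty)
    acc[1] += float(price)

print("SalesOrderID OrderQty UnitPrice")
for order_id, (qty, price) in totals.items():
    print(order_id, qty, round(price, 4))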
Hopefully that helps.