Can't 'reverse' the months listed in DatePickerDialog - android-datepicker

I want to 'reverse' the months listed in the spinner displayed in the DatePickerDialog.
DatePickerDialog uses the widget, android.widget.DatePicker, to populate the day, month, and year spinners. For example, when calling the private method setCurrentLocale() to populate the month spinner, the widget populates its member variable, mShortMonths.
private void setCurrentLocale(Locale locale) {
    ...
    mShortMonths = new DateFormatSymbols().getShortMonths();
    ...
}
You find that the class, DateFormatSymbols, returns the string array, "[Jan, Feb, Mar, Apr, May, Jun, Jul, Aug, Sep, Oct, Nov, Dec]", for the device's default locale, Locale.getDefault().
I'm told LocaleData.get() returns a shared LocaleData for the given locale. And so, I was thinking I could 'adjust' that shared value through the public member, LocaleData.shortMonthNames:
import libcore.icu.ICU;
import libcore.icu.LocaleData;
...
transient LocaleData localeData;
localeData = LocaleData.get(Locale.getDefault());
ArrayUtils.reverse(localeData.shortMonthNames);  // shortMonthNames is an instance field
With that, the hope is that when the spinners are next populated, I will get the months in reverse:
mShortMonths = new DateFormatSymbols().getShortMonths();
"[Dec, Nov, Oct, Sep, Aug, Jul, Jun, May, Apr, Mar, Feb, Jan]"
When I try to implement this approach, I get the error 'The import libcore cannot be resolved' on the following import statements.
import libcore.icu.ICU;
import libcore.icu.LocaleData;
Why would these import statements produce this error when those very same import statements work just fine while debugging and stepping through the class, DateFormatSymbols, for example?

Related

Upsert the table in BigQuery with a condition

I have two tables, A and B, with key 'place'; A contains 'place' values for the month of December and B contains 'place' values for the month of January. I need to create two columns (first seen and last seen) such that:
If a 'place' in A is present in B, then first seen = Dec and last seen = March.
If a 'place' in A is not present in B, then first seen = Dec and last seen = Dec.
For a new 'place' added in B, first seen = March and last seen = March.
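A minimal sketch of one way to express those three cases, assuming tables named mydataset.A and mydataset.B (the dataset name, month labels, and client setup are assumptions, not from the question); a FULL OUTER JOIN on 'place' covers all three cases in a single query:
from google.cloud import bigquery

client = bigquery.Client()  # assumes default credentials/project

# Hypothetical table names; adjust to your schema.
sql = """
SELECT
  COALESCE(a.place, b.place) AS place,
  -- in A => first seen in December; only in B => new in March
  IF(a.place IS NOT NULL, 'Dec', 'March') AS first_seen,
  -- in B => last seen in March; only in A => last seen in December
  IF(b.place IS NOT NULL, 'March', 'Dec') AS last_seen
FROM `mydataset.A` AS a
FULL OUTER JOIN `mydataset.B` AS b
  ON a.place = b.place
"""

for row in client.query(sql).result():
    print(row.place, row.first_seen, row.last_seen)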

Python dataframe grouping

I'm trying to compute the average movie ratings for the following four time intervals during which the movies were released: (a) 1970 to 1979, (b) 1980 to 1989, etc., and I wonder what I did wrong here, since I'm new to DS.
EDIT
Since the dataset has no year column, I extract the release year embedded in the title column and assign it as a new column on the dataset:
year = df['title'].str.findall(r'\((\d{4})\)').str.get(0)
year_df = df.assign(year=year.values)
Because there are some strings in the column, I convert the entire "year" column to int. Then I use groupby to group the years into 10-year intervals.
year_df['year'] = year_df['year'].astype(int)
year_df = year_df.groupby(year_df.year // 10 * 10)
After that, I want to assign the year group into an interval of 10 years:
year_desc = {1910: "1910 – 1919", 1920: "1920 – 1929", 1930: "1930 – 1939",
             1940: "1940 – 1949", 1950: "1950 – 1959", 1960: "1960 – 1969",
             1970: "1970 – 1979", 1980: "1980 – 1989", 1990: "1990 – 1999",
             2000: "2000 – 2009"}
year_df['year'] = [year_desc[x] for x in year_df['year']]
When I run my code after trying to assign year group, I get an error stated that:
TypeError: 'Series' objects are mutable, thus they cannot be hashed
UPDATES:
I tried to follow @ozacha's suggestion and I'm still getting an error, but this time it is:
'SeriesGroupBy' object has no attribute 'map'
Ad 1) Your year_df already has a year column, so there is no need to recreate it using df.assign(); .assign() is an alternative way of (re)defining columns in a dataframe.
Ad 2) Not sure what your test_group is, so it is difficult to tell the source of the error. However, I believe this is what you want, using pd.Series.map:
year_df = ...
year_df['year'] = year_df['year'].astype(int)
year_desc = {...}
year_df['year_group'] = year_df['year'].map(year_desc)
Alternatively, you can also generate year groups dynamically:
year_df['year_group'] = year_df['year'].map(lambda year: f"{year} – {year + 9}")
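Putting the answer together end to end, here is a minimal runnable sketch (the 'title' and 'rating' columns and the sample rows are assumptions for illustration): extract the year, build the decade label on the plain Series before any groupby, then average:
import pandas as pd

df = pd.DataFrame({
    'title': ['Jaws (1975)', 'Alien (1979)', 'Toy Story (1995)'],
    'rating': [4.0, 4.5, 4.2],
})

# Extract the release year from the title and floor it to the decade.
year = df['title'].str.findall(r'\((\d{4})\)').str.get(0).astype(int)
df['year_group'] = (year // 10 * 10).map(lambda d: f"{d} – {d + 9}")

# Map *before* grouping: a plain Series has .map, a SeriesGroupBy does not.
print(df.groupby('year_group')['rating'].mean())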

Rolling up daily data to weekly with resample and offset in pandas

I have daily COVID data that I would like to roll up to weekly. The problem is that I want my weeks to go from Sunday to Saturday, but the default is Monday to Sunday. I tried to use loffset, but it only changes the dates, not my data, plus it adds a date that does not exist in the dataset.
Code:
logic = {'iso_code': 'first',
         'new_cases': 'sum',
         'new_deaths': 'sum',
         'icu_patients': 'sum',
         'hosp_patients': 'sum',
         'people_vaccinated': 'sum'}  # could also be 'first', 'max', 'last', etc.
offset = pd.offsets.DateOffset(-1)
df_covid_weekly = df_covid_file.resample('W', on='date', label='right', loffset=offset).apply(logic).reset_index()
Raw Data Snippet:
Current Outcome:
Expected Outcome:
Use anchored offsets:
df_covid_file.resample('W-SAT', on='date', label = 'right')
The offset W is equivalent to W-SUN ("week ending on Sunday") and W-SAT is "week ending on Saturday", and so on.
If you want an offset object you can use pd.offsets.Week(weekday=5), which is equivalent to W-SAT. The offset strings are aliases for these objects. Sometimes using the objects instead of their string counterparts makes code parametrization a little easier.
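For example, here is a small self-contained sketch (the two-week frame and column names are made up for illustration); note that the resampler's .agg accepts the same dict that was passed to .apply above:
import pandas as pd

df = pd.DataFrame({
    'date': pd.date_range('2021-01-03', periods=14, freq='D'),
    'new_cases': [10] * 14,
})

# weekday=5 anchors the week to Saturday (Monday == 0), i.e. 'W-SAT'.
weekly = (df.resample(pd.offsets.Week(weekday=5), on='date', label='right')
            .agg({'new_cases': 'sum'})
            .reset_index())
print(weekly)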

How to remove rows in Pandas DataFrame that are partial duplicates?

I have a DataFrame of scraped tweets, and I am trying to remove the rows of tweets that are partial duplicates.
Below is a simplified DataFrame with the same issue. Notice how the first and the last tweets have everything but the attached URL ending in common; I need a way to drop partial duplicates like this and keep only the latest instance.
data = {
    'Tweets': ['The Interstate is closed www.txdot.com/closed',
               'The project is complete www.txdot.com/news',
               'The Interstate is closed www.txdot.com/news'],
    'Date': ['Mon Aug 03 20:48:42', 'Mon Aug 03 20:15:42', 'Mon Aug 03 20:01:42']
}
df = pd.DataFrame(data)
I've tried dropping duplicates with the drop_duplicates method below, but there doesn't seem to be an argument that accomplishes this.
df.drop_duplicates(subset=['Tweets'])
Any ideas how to accomplish this?
You can write a regex that identifies each tweet by the main URL portion and ignores everything after the forward slash.
df['Tweets'].replace(r'(www\.\w+\.com)/(\w+)', r'\1', regex=True).drop_duplicates()
Yields
0 The Interstate is closed www.txdot.com
1 The project is complete www.txdot.com
Name: Tweets, dtype: object
We can then pass the surviving index to .loc to filter the original frame.
df.loc[df['Tweets'].replace(r'(www\.\w+\.com)/(\w+)', r'\1', regex=True).drop_duplicates().index]
Tweets Date
0 The Interstate is closed www.txdot.com/closed Mon Aug 03 20:48:42
1 The project is complete www.txdot.com/news Mon Aug 03 20:15:42
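An equivalent variation on the same idea, sketched here so the deduplication reads as a single boolean mask (keep='first' keeps the earliest row, which in this sample is the latest tweet because the data is sorted newest-first):
import pandas as pd

# Normalized key: the tweet text with the URL path stripped off.
key = df['Tweets'].replace(r'(www\.\w+\.com)/(\w+)', r'\1', regex=True)

# Keep only rows whose normalized key has not been seen before.
deduped = df.loc[~key.duplicated(keep='first')]
print(deduped)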

How to construct a data frame from raw data from a CSV file

I am currently learning the python environment to process sensor data.
I have a board with 32 sensors reading temperature. At the following link, you can find an extract of the raw data: https://5e86ea3db5a86.htmlsave.net/
I am trying to construct a data frame grouped by date from my CSV file using pandas (see the potential structure of the table: https://docs.google.com/spreadsheets/d/1zpDI7tp4nSn8-Hm3T_xd4Xz7MV6VDGcWGxwNO-8S0-s/edit?usp=sharing).
So far, I have read the data file into pandas and deleted all the unnamed columns. I am struggling with creating a sensor ID column, which should contain the 32 sensor IDs, and a temperature column.
How should I loop through this CSV file to create three columns (date, sensor ID, and temperature)?
Thanks for the help
It looks like the first item on each line is the date, then there are pairs of sensor ID and value, then a blank value that we can exclude. If so, the following should work. If not, try to modify the code to your purposes.
import pandas as pd

data = []
with open('filename.txt', 'r') as f:
    for line in f:
        # the if excludes empty strings
        parts = [part for part in line.split(',') if part]
        # this gets the date in a format that pandas can recognize;
        # you can omit the replace operations if not needed
        sensor_date = parts[0].strip().replace('[', '').replace(']', '')
        # the rest of the list are the pairings of sensor and reading
        sensor_readings = parts[1:]
        # list slicing iterates over even and odd elements:
        # ::2 is every second item starting at zero (the sensor IDs),
        # 1::2 is every second item starting at one (the readings)
        for sensor, reading in zip(sensor_readings[::2], sensor_readings[1::2]):
            data.append({'sensor_date': sensor_date,
                         'sensor': sensor,
                         'reading': reading})

pd.DataFrame(data)
Using your sample data, I got the following:
=== Output: ===
Out[64]:
sensor_date sensor reading
0 Tue Jul 02 16:35:22.782 2019 28C037080B000089 16.8750
1 Tue Jul 02 16:35:22.782 2019 284846080B000062 17.0000
2 Tue Jul 02 16:35:22.782 2019 28A4BA070B00002B 16.8750
3 Tue Jul 02 16:35:22.782 2019 28D4E3070B0000D5 16.9375
4 Tue Jul 02 16:35:22.782 2019 28A21E080B00002F 17.0000
.. ... ... ...
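As a possible follow-up, sketched under the assumption that every reading parses as a number and the timestamps all share the asctime-like format above, the long-format frame can be pivoted so that each sensor becomes its own column, matching the per-date layout the question links to:
import pandas as pd

df = pd.DataFrame(data)
df['reading'] = df['reading'].astype(float)  # readings arrive as strings
df['sensor_date'] = pd.to_datetime(df['sensor_date'],
                                   format='%a %b %d %H:%M:%S.%f %Y')

# One row per timestamp, one column per sensor ID.
wide = df.pivot(index='sensor_date', columns='sensor', values='reading')
print(wide.head())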