Greatest Small date - python

I have two date columns, say A and B, in two separate tables. Column A contains the dates of tests, and column B contains the dates on which the factory was calibrated. For each test date, I want to find how many days have passed since the factory was last calibrated.
For example:
A=['2020-02-26', '2020-02-27', '2020-02-28', '2020-02-29']
B=['2020-02-24', '2020-02-28']
Days_Passed since last calibration corresponding to A are [2,3,0,1]

Take the smallest date as reference 0 and convert every other date into a number of days relative to it:
A = [2,3,4,5]
B = [0,4]
For each value in A, perform a binary search on B (which must be sorted) to find the largest value that is less than or equal to it. The difference between the two is the Days_Passed since the last calibration.
Answer Array = [2,3,0,1].
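Since the question is tagged Python, the binary-search idea above could be sketched with the standard bisect module. This is only an illustration: the function name days_since_calibration is made up here, and returning None for a test that precedes every calibration is an assumption, because the question does not cover that case.

```python
from bisect import bisect_right
from datetime import date


def days_since_calibration(test_dates, calibration_dates):
    """For each test date, return the days since the latest calibration on or before it."""
    calibrations = sorted(date.fromisoformat(d) for d in calibration_dates)
    result = []
    for raw in test_dates:
        tested = date.fromisoformat(raw)
        # bisect_right returns how many calibration dates are <= tested.
        idx = bisect_right(calibrations, tested)
        # Assumption: None when no calibration happened on or before the test.
        result.append((tested - calibrations[idx - 1]).days if idx else None)
    return result


A = ['2020-02-26', '2020-02-27', '2020-02-28', '2020-02-29']
B = ['2020-02-24', '2020-02-28']
print(days_since_calibration(A, B))  # [2, 3, 0, 1]
```

Each lookup costs O(log m), so the whole pass is O(n log m) and A does not need to be sorted.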

If the dates in A and B are sorted, this can be done in O(n+m), where n and m are the lengths of A and B. You didn't mention the programming language, so here is an implementation in C#.
the main part:
foreach (var testedDate in testedDates)
{
    // Advance past every calibration that happened on or before this test date,
    // so that several calibrations between two tests are handled correctly.
    while (nextCalibratedDate.HasValue && (testedDate - nextCalibratedDate.Value).Days >= 0)
    {
        calibratedDate = nextCalibratedDate.Value;
        nextCalibratedDate = enumerator.MoveNext() ? (DateTime?)enumerator.Current : null;
    }
    Console.WriteLine((testedDate - calibratedDate).Days);
}
And this is the complete code:
// Requires: using System; using System.Linq;
public static void Main(string[] args)
{
    string[] A = new[] { "2020-02-26", "2020-02-27", "2020-02-28", "2020-02-29" };
    string[] B = new[] { "2020-02-24", "2020-02-28" };
    var testedDates = A
        .Select(x => DateTime.Parse(x))
        .ToArray();
    var calibratedDates = B
        .Select(x => DateTime.Parse(x))
        .ToArray();
    // Start with the first calibration and look ahead to the next one, if any.
    var enumerator = calibratedDates.GetEnumerator();
    enumerator.MoveNext();
    var calibratedDate = (DateTime)enumerator.Current;
    DateTime? nextCalibratedDate = default;
    if (enumerator.MoveNext())
    {
        nextCalibratedDate = (DateTime?)enumerator.Current;
    }
    foreach (var testedDate in testedDates)
    {
        // Advance past every calibration that happened on or before this test date,
        // so that several calibrations between two tests are handled correctly.
        while (nextCalibratedDate.HasValue && (testedDate - nextCalibratedDate.Value).Days >= 0)
        {
            calibratedDate = nextCalibratedDate.Value;
            nextCalibratedDate = enumerator.MoveNext() ? (DateTime?)enumerator.Current : null;
        }
        Console.WriteLine((testedDate - calibratedDate).Days);
    }
}
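For readers who want the same linear sweep in Python, a minimal sketch could look like the following. It assumes both lists are already in ascending order; the function name days_since_calibration_linear is hypothetical, and None is returned for a test that precedes every calibration, which is a choice made for this sketch only.

```python
from datetime import date


def days_since_calibration_linear(test_dates, calibration_dates):
    """Two-pointer sweep over two ascending lists of ISO date strings."""
    tests = [date.fromisoformat(d) for d in test_dates]
    calibrations = [date.fromisoformat(d) for d in calibration_dates]
    result = []
    i = -1  # index of the latest calibration on or before the current test date
    for tested in tests:
        # Advance past every calibration that happened on or before this test.
        while i + 1 < len(calibrations) and calibrations[i + 1] <= tested:
            i += 1
        # Assumption: None when the test precedes every calibration.
        result.append((tested - calibrations[i]).days if i >= 0 else None)
    return result


print(days_since_calibration_linear(
    ['2020-02-26', '2020-02-27', '2020-02-28', '2020-02-29'],
    ['2020-02-24', '2020-02-28'],
))  # [2, 3, 0, 1]
```

The index i only ever moves forward, so the total work is O(n+m).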

Add rows to Google Sheets sheet if not enough

I have a bunch of data that I routinely back up into a Google Sheets sheet with a Python script. The sheet currently has 5385 rows filled out of 6041 total. I know that if I try to upload more than 6041 rows the update will fail, and that I can fix this by opening the sheet, scrolling all the way to the bottom, and clicking "Add 1000 more rows at bottom" a few times.
Is there a way for googleapiclient to automatically make sure that there's room in the sheet?
Edit:
cells = 'Backup!A{}:{}{}'.format(start_ind, self._excel_column_index(len(headers)), start_ind + len(to_cache) + 1)
values = self._excel_serialize_arb_array(to_cache, headers)
data = {'values': values}
self.sheet.values().update(spreadsheetId=self.spreadsheet_ids['FOO'],
                           range=cells, valueInputOption='USER_ENTERED',
                           body=data).execute()
You need to either update the properties of the existing sheet or create a new sheet whose properties allow more than 1000 rows.
Note: I wrote this code in Apps Script before you edited the question to include your own code.
/**
 * Add a new sheet with some properties.
 * The spreadsheet ID is hard-coded in the function body.
 */
// This function adds a sheet named 'Deposits' with 100,000 empty rows that can then be filled.
function addSheet() {
  const spreadsheetId = "yourspreadsheetID";
  var requests = [{
    'addSheet': {
      'properties': {
        'title': 'Deposits',
        'gridProperties': {
          'rowCount': 100000,
          'columnCount': 2
        },
        'tabColor': {
          'red': 1.0,
          'green': 0.3,
          'blue': 0.4
        }
      }
    }
  }];
  var response =
      Sheets.Spreadsheets.batchUpdate({'requests': requests}, spreadsheetId);
  Logger.log('Created sheet with ID: ' +
      response.replies[0].addSheet.properties.sheetId);
}
// This function writes 10,000 rows, which is more than the 1000-row limit you might be hitting.
function myFunction() {
  // Spreadsheet ID.
  const spreadsheetId = "yourspreadsheetID";
  const max = 10000;
  const data = [];
  for (let i = 0; i < max; i++) {
    data.push({range: `Deposits!A${i + 1}`, values: [[`A${i + 1}`]]});
  }
  Sheets.Spreadsheets.Values.batchUpdate({data: data, valueInputOption: "USER_ENTERED"}, spreadsheetId);
}
I ran the first function to create the sheet with those properties using "batchUpdate", and was then able to add 10,000 strings to it.
I would assume that in Python you need to increase the number of rows in the sheet using an UpdateSheetPropertiesRequest or an InsertDimensionRequest. Check it here: Sheet API update properties.
Here's my solution coded up.
def get_and_update_sheet_properties(self, min_rows):
    ''' Ensure we have enough rows for our arbs cache. '''
    # Fetch only the title and grid size of every sheet in the spreadsheet.
    sized_sheets = self.sheet.get(spreadsheetId=self.spreadsheet_ids['FOO'],
                                  fields="sheets(properties(title,gridProperties(columnCount,rowCount)))").execute()
    arb_sheet = [sized_sheet for sized_sheet in sized_sheets['sheets'] if sized_sheet['properties']['title'] == 'BAR'][0]
    cur_rows = arb_sheet['properties']['gridProperties']['rowCount']
    # Round down to a multiple of 100, then add 200 rows of headroom.
    padded_rows = min_rows - (min_rows % 100) + 200
    if padded_rows > cur_rows:
        # The batchUpdate body documents 'requests' as a list of request objects.
        requests = [{
            "updateSheetProperties": {
                "properties": {
                    "sheetId": 1922941665,
                    'gridProperties': {
                        'rowCount': padded_rows,
                        'columnCount': 26
                    },
                },
                "fields": "gridProperties",
            }
        }]
        body = {
            'requests': requests
        }
        self.sheet.batchUpdate(spreadsheetId=self.spreadsheet_ids['FOO'], body=body).execute()
        return True
    else:
        return False
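An alternative that avoids resending the column count is the Sheets API appendDimension request, which adds rows to the end of a sheet. The sketch below only builds the request body, so it can be tried without credentials. The helper name build_append_rows_request and the default headroom of 200 rows are assumptions made for illustration, not part of the code above.

```python
def build_append_rows_request(sheet_id, current_rows, needed_rows, headroom=200):
    """Return a batchUpdate body that appends rows, or None if the sheet is big enough.

    sheet_id, current_rows and needed_rows come from the caller; headroom is an
    arbitrary padding chosen for this sketch.
    """
    if needed_rows <= current_rows:
        return None
    return {
        "requests": [{
            "appendDimension": {
                "sheetId": sheet_id,
                "dimension": "ROWS",
                # Add the missing rows plus some spare rows for the next backups.
                "length": needed_rows - current_rows + headroom,
            }
        }]
    }


body = build_append_rows_request(sheet_id=1922941665, current_rows=6041, needed_rows=7000)
print(body["requests"][0]["appendDimension"]["length"])  # 1159
```

When the result is not None, it could be passed as the body argument of the same self.sheet.batchUpdate(...).execute() call used in the solution above.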

Kotlin set Array as key for a HashMap

I'm doing a bit of LeetCode and I'm stuck on the Group Anagrams problem. I have a Python background, where I can do the following:
res = defaultdict(list)
count = [0] * 26
res[tuple(count)].append(s)
As we can see, the list converted to a tuple can be used as the key of the dictionary. I want to do the same thing in Kotlin. However, when I build the array inside a for loop in Kotlin, every array is treated as a different key, even when the contents are the same.
fun groupAnagrams(strs: Array<String>): List<List<String>> {
    val hashMap = hashMapOf<IntArray, ArrayList<String>>()
    for (word in strs) {
        val array = IntArray(26) { 0 }
        for (char in word) {
            val charInt = char - 'a'
            array[charInt] += 1
        }
        if (hashMap.containsKey(array)) {
            hashMap[array]!!.add(word)
        } else {
            hashMap[array] = ArrayList<String>().apply { add(word) }
        }
    }
    return hashMap.values.toList()
}
Is this something that can be done in Kotlin?
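Equality for IntArray is based on its reference: arrays do not override equals() and hashCode(), so two arrays with the same contents are still different keys. You can use a List here in place of the IntArray. Two Lists are equal if they contain the same elements in the same order.
The modified code looks like this: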
fun groupAnagrams(strs: Array<String>): List<List<String>> {
    val hashMap = hashMapOf<List<Int>, ArrayList<String>>()
    for (word in strs) {
        val array = MutableList(26) { 0 }
        for (char in word) {
            val charInt = char - 'a'
            array[charInt] += 1
        }
        if (hashMap.containsKey(array)) {
            hashMap[array]!!.add(word)
        } else {
            hashMap[array] = ArrayList<String>().apply { add(word) }
        }
    }
    return hashMap.values.toList()
}
To avoid the problem you ran into (equality of arrays), you can use String keys instead:
import java.security.MessageDigest
fun groupAnagramsWithHashing(strs: Array<String>): List<List<String>> {
    val map = hashMapOf<String, MutableList<String>>()
    MessageDigest.getInstance("SHA-1").also { sha ->
        for (word in strs) {
            // Anagrams have the same sorted bytes, hence the same digest.
            word.toByteArray().sorted().forEach { sha.update(it) }
            // digest() also resets the MessageDigest for the next word.
            val key = sha.digest().joinToString()
            map.computeIfAbsent(key) { mutableListOf() }.add(word)
        }
    }
    return map.values.toList()
}
fun main() {
    val input = arrayOf("eat", "tea", "tan", "ate", "nat", "bat")
    groupAnagramsWithHashing(input).also { println(it) }
    // [[eat, tea, ate], [bat], [tan, nat]]
}

Reading large data from mongoDB in batches - Pymongo [duplicate]

I know that it is bad practice to use skip to implement pagination, because when your data gets large, skip starts to consume a lot of memory. One way to overcome this problem is to rely on the natural order of the _id field:
// Page 1
db.users.find().limit(pageSize);
// Find the id of the last document in this page
last_id = ...
// Page 2
users = db.users.find({'_id': {'$gt': last_id}}).limit(pageSize);
The problem is that I'm new to Mongo and do not know the best way to get this last_id.
The concept you are talking about can be called "forward paging". The reason for the name is that, unlike the .skip() and .limit() modifiers, it cannot be used to "go back" to a previous page or to "skip" to a specific page, at least not without a great deal of effort to store "seen" or "discovered" pages. So if that type of "links to page" paging is what you want, you are best off sticking with the .skip() and .limit() approach, despite the performance drawbacks.
If it is a viable option for you to only "move forward", then here is the basic concept:
db.junk.find().limit(3)
{ "_id" : ObjectId("54c03f0c2f63310180151877"), "a" : 1, "b" : 1 }
{ "_id" : ObjectId("54c03f0c2f63310180151878"), "a" : 4, "b" : 4 }
{ "_id" : ObjectId("54c03f0c2f63310180151879"), "a" : 10, "b" : 10 }
Of course that's your first page with a limit of 3 items. Consider that now with code iterating the cursor:
var lastSeen = null;
var cursor = db.junk.find().limit(3);
while (cursor.hasNext()) {
    var doc = cursor.next();
    printjson(doc);
    if (!cursor.hasNext())
        lastSeen = doc._id;
}
That iterates the cursor and does something with each document. When the last item in the cursor is reached, lastSeen is set to that document's _id:
ObjectId("54c03f0c2f63310180151879")
In your subsequent iterations you just feed that _id value, which you keep (in the session or wherever), to the query:
var cursor = db.junk.find({ "_id": { "$gt": lastSeen } }).limit(3);
while (cursor.hasNext()) {
    var doc = cursor.next();
    printjson(doc);
    if (!cursor.hasNext())
        lastSeen = doc._id;
}
{ "_id" : ObjectId("54c03f0c2f6331018015187a"), "a" : 1, "b" : 1 }
{ "_id" : ObjectId("54c03f0c2f6331018015187b"), "a" : 6, "b" : 6 }
{ "_id" : ObjectId("54c03f0c2f6331018015187c"), "a" : 7, "b" : 7 }
And the process repeats over and over until no more results can be obtained.
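Since the question is about PyMongo, the same loop could be sketched in Python as a generator. This is a minimal illustration rather than a definitive implementation: the name iter_batches is made up, and it assumes the collection object exposes the usual find(...).sort(...).limit(...) chain, as a pymongo Collection does. The sort on _id is added explicitly so that "greater than the last seen _id" is well defined.

```python
def iter_batches(collection, batch_size):
    """Yield lists of documents in ascending _id order, batch_size at a time."""
    last_id = None
    while True:
        # First page: no filter. Later pages: only documents after the last seen _id.
        query = {} if last_id is None else {"_id": {"$gt": last_id}}
        batch = list(collection.find(query).sort("_id", 1).limit(batch_size))
        if not batch:
            return
        yield batch
        last_id = batch[-1]["_id"]


# Hypothetical usage, assuming a running server and a collection called "junk":
#   from pymongo import MongoClient
#   for batch in iter_batches(MongoClient()["test"]["junk"], 3):
#       process(batch)
```

Because each query starts from the indexed _id value, no page has to walk over the documents that were already returned.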
That's the basic process for a natural order such as _id. For something else it gets a bit more complex. Consider the following:
{ "_id": 4, "rank": 3 }
{ "_id": 8, "rank": 3 }
{ "_id": 1, "rank": 3 }
{ "_id": 3, "rank": 2 }
To split that into two pages sorted by rank, what you essentially need to know is what you have "already seen", so that you can exclude those results. So, looking at a first page:
var lastSeen = null;
var seenIds = [];
var cursor = db.junk.find().sort({ "rank": -1 }).limit(2);
while (cursor.hasNext()) {
    var doc = cursor.next();
    printjson(doc);
    // A new rank value starts: ids seen under the previous rank are no longer needed.
    if (doc.rank != lastSeen) {
        seenIds = [];
        lastSeen = doc.rank;
    }
    seenIds.push(doc._id);
}
{ "_id": 4, "rank": 3 }
{ "_id": 8, "rank": 3 }
On the next iteration you want documents whose "rank" is less than or equal to the lastSeen score, while excluding the documents already seen. You do this with the $nin operator:
var cursor = db.junk.find(
    { "_id": { "$nin": seenIds }, "rank": { "$lte": lastSeen } }
).sort({ "rank": -1 }).limit(2);
while (cursor.hasNext()) {
    var doc = cursor.next();
    printjson(doc);
    // A new rank value starts: ids seen under the previous rank are no longer needed.
    if (doc.rank != lastSeen) {
        seenIds = [];
        lastSeen = doc.rank;
    }
    seenIds.push(doc._id);
}
{ "_id": 1, "rank": 3 }
{ "_id": 3, "rank": 2 }
How many "seenIds" you actually hold on to depends on how "granular" your results are, that is, how often the sort value is likely to change. In this case you can check whether the current "rank" score differs from the lastSeen value and, if so, discard the present seenIds content so that it does not grow too much.
Those are the basic concepts of "forward paging" for you to practice and learn.
The simplest way to implement pagination in MongoDB is with skip and limit:
// Pagination
const page = parseInt(req.query.page, 10) || 1;
const limit = parseInt(req.query.limit, 10) || 25;
const startIndex = (page - 1) * limit;
const endIndex = page * limit;
query = query.skip(startIndex).limit(limit);

hex/binary string conversion in Swift

Python has two very useful library methods (binascii.a2b_hex(keyStr) and binascii.hexlify(keyBytes)) whose equivalents I have been struggling to find in Swift. Is there anything readily available in Swift? If not, how would one implement it, assuming all the bounds and other checks (like an even-length key) are already done?
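For reference, this is the behavior of the two Python calls that a Swift version would need to reproduce. The sample hex string is arbitrary and chosen only for illustration.

```python
import binascii

# Hex string -> raw bytes (two hex digits per byte).
key_bytes = binascii.a2b_hex("0002468a13579bff")
print(list(key_bytes))  # [0, 2, 70, 138, 19, 87, 155, 255]

# Raw bytes -> lowercase hex, returned as a bytes object.
key_hex = binascii.hexlify(key_bytes)
print(key_hex)  # b'0002468a13579bff'

# An odd number of hex digits is rejected with binascii.Error.
try:
    binascii.a2b_hex("abc")
except binascii.Error as exc:
    print("rejected:", exc)
```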
Data from Swift 3 has no "built-in" method to print its contents as a hex string, or to create a Data value from a hex string.
"Data to hex string" methods can be found e.g. at How to convert Data to hex string in swift or How can I print the content of a variable of type Data using Swift? or converting String to Data in swift 3.0. Here is an implementation from the first link:
extension Data {
    func hexEncodedString() -> String {
        return map { String(format: "%02hhx", $0) }.joined()
    }
}
Here is a possible implementation of the reverse "hex string to Data" conversion (taken from Hex String to Bytes (NSData) on Code Review, translated to Swift 3 and improved) as a failable initializer:
extension Data {
    init?(fromHexEncodedString string: String) {
        // Convert 0 ... 9, a ... f, A ... F to their decimal value,
        // return nil for all other input characters
        func decodeNibble(u: UInt8) -> UInt8? {
            switch(u) {
            case 0x30 ... 0x39:
                return u - 0x30
            case 0x41 ... 0x46:
                return u - 0x41 + 10
            case 0x61 ... 0x66:
                return u - 0x61 + 10
            default:
                return nil
            }
        }
        self.init(capacity: string.utf8.count/2)
        var iter = string.utf8.makeIterator()
        while let c1 = iter.next() {
            guard
                let val1 = decodeNibble(u: c1),
                let c2 = iter.next(),
                let val2 = decodeNibble(u: c2)
            else { return nil }
            self.append(val1 << 4 + val2)
        }
    }
}
Example:
// Hex string to Data:
if let data = Data(fromHexEncodedString: "0002468A13579BFF") {
    let idata = Data(data.map { 255 - $0 })
    // Data to hex string:
    print(idata.hexEncodedString()) // fffdb975eca86400
} else {
    print("invalid hex string")
}
I'm not really familiar with Python and the checks it performs when converting the numbers, but you can expand on the function below:
func convert(_ str: String, fromRadix r1: Int, toRadix r2: Int) -> String? {
    if let num = Int(str, radix: r1) {
        return String(num, radix: r2)
    } else {
        return nil
    }
}
convert("11111111", fromRadix: 2, toRadix: 16)
convert("ff", fromRadix: 16, toRadix: 2)
Swift 2
extension NSData {
    class func dataFromHexString(hex: String) -> NSData? {
        // Accept only hexadecimal digits.
        let regex = try! NSRegularExpression(pattern: "^[0-9a-fA-F]*$", options: .CaseInsensitive)
        let validate = regex.firstMatchInString(hex, options: NSMatchingOptions.init(rawValue: 0), range: NSRange(location: 0, length: hex.characters.count))
        if validate == nil || hex.characters.count % 2 != 0 {
            return nil
        }
        let data = NSMutableData()
        for i in 0..<hex.characters.count/2 {
            // Note: substring(_:length:) is not a standard String method; a helper extension is assumed.
            let hexStr = hex.substring(i * 2, length: 2)
            var ch: UInt32 = 0
            NSScanner(string: hexStr).scanHexInt(&ch)
            data.appendBytes(&ch, length: 1)
        }
        return data
    }
}
let a = 0xabcd1234
print(String(format: "%x", a)) // Integer to hex string
NSData.dataFromHexString("abcd1234") // Hex string to NSData

Hi, How do I run the equivalent using pymongo? cfg = rs.conf() db.printSlaveReplicationInfo()

1. How can I run the equivalent of the following using PyMongo?
a. cfg = rs.conf()
b. db.printSlaveReplicationInfo()
2. Using PyMongo, how can I get the same CLI output for the other replica sets in the clusters I created?
(Note: I have already successfully created the cluster. I am just writing a Python script on the primary to check the output of rs.conf() and db.printSlaveReplicationInfo() for all the replica sets inside the cluster and to parse that output.)
Any help in this regard is greatly appreciated.
The replica set configuration is stored in the "local" database in a collection called "system.replset", so the equivalent of rs.conf() would be db.system.replset.findOne() when db is local, or its equivalent find_one() in Python.
db.printSlaveReplicationInfo() is a bit more involved, but you can get all of that information in the local database as well.
It may be easier to get it via the admin database command replSetGetStatus, which returns a document containing oplog information for each member of the replica set, along with other details. The Python MongoDB driver, pymongo, provides a method to run commands, so you can run it against the admin DB and parse the output for information about where each member of the replica set is relative to the primary.
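A minimal sketch of that approach might look like this. The three function names are made up for illustration, the client argument is expected to be a pymongo MongoClient, and the lag calculation assumes each data-bearing member of the replSetGetStatus output carries state, name and optimeDate fields. The sample status document at the end is fabricated test data, not real server output.

```python
from datetime import datetime


def get_replica_config(client):
    """Rough equivalent of rs.conf(): read the document in local.system.replset."""
    return client["local"]["system.replset"].find_one()


def get_replica_status(client):
    """Run the replSetGetStatus command against the admin database."""
    return client["admin"].command("replSetGetStatus")


def replication_lag_seconds(status):
    """Map each non-primary, non-arbiter member name to its lag behind the primary."""
    members = status["members"]
    primary = next((m for m in members if m["state"] == 1), None)
    if primary is None:
        return {}
    lags = {}
    for member in members:
        # Skip primaries (1), arbiters (7) and members with no optime yet.
        if member["state"] in (1, 7) or "optimeDate" not in member:
            continue
        lags[member["name"]] = (primary["optimeDate"] - member["optimeDate"]).total_seconds()
    return lags


# Fabricated status document, shaped like the parts of replSetGetStatus used above.
sample_status = {"members": [
    {"name": "a:27017", "state": 1, "optimeDate": datetime(2020, 1, 1, 12, 0, 5)},
    {"name": "b:27017", "state": 2, "optimeDate": datetime(2020, 1, 1, 12, 0, 0)},
    {"name": "c:27017", "state": 7},
]}
print(replication_lag_seconds(sample_status))  # {'b:27017': 5.0}
```

Against a real deployment, the status would come from get_replica_status(MongoClient(uri)), where uri is whatever connection string your replica set uses.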
I'm basically going to give you a hint rather than directly answer it, because the full answer is to simply code it. But you probably are unaware that you can do this simple thing in the shell:
> db.printSlaveReplicationInfo
function () {
    var startOptimeDate = null;
    function getReplLag(st) {
        assert( startOptimeDate , "how could this be null (getReplLag startOptimeDate)" );
        print("\tsyncedTo: " + st.toString() );
        var ago = (startOptimeDate-st)/1000;
        var hrs = Math.round(ago/36)/100;
        print("\t" + Math.round(ago) + " secs (" + hrs + " hrs) behind the primary ");
    };
    function getMaster(members) {
        var found;
        members.forEach(function(row) {
            if (row.self) {
                found = row;
                return false;
            }
        });
        if (found) {
            return found;
        }
    };
    function g(x) {
        assert( x , "how could this be null (printSlaveReplicationInfo gx)" )
        print("source: " + x.host);
        if ( x.syncedTo ){
            var st = new Date( DB.tsToSeconds( x.syncedTo ) * 1000 );
            getReplLag(st);
        }
        else {
            print( "\tdoing initial sync" );
        }
    };
    function r(x) {
        assert( x , "how could this be null (printSlaveReplicationInfo rx)" );
        if ( x.state == 1 || x.state == 7 ) { // ignore primaries (1) and arbiters (7)
            return;
        }
        print("source: " + x.name);
        if ( x.optime ) {
            getReplLag(x.optimeDate);
        }
        else {
            print( "\tno replication info, yet. State: " + x.stateStr );
        }
    };
    var L = this.getSiblingDB("local");
    if (L.system.replset.count() != 0) {
        var status = this.adminCommand({'replSetGetStatus' : 1});
        startOptimeDate = getMaster(status.members).optimeDate;
        status.members.forEach(r);
    }
    else if( L.sources.count() != 0 ) {
        startOptimeDate = new Date();
        L.sources.find().forEach(g);
    }
    else {
        print("local.sources is empty; is this db a --slave?");
        return;
    }
}
I love REPLs, and much like Python's own famous REPL, you can just get a dump of what the implemented function does.
Simples.
