pandas .rolling().mean() analog in C#

pandas .rolling().mean() analog in C# - python

I'm trying to convert the following python code which calculates ATR using EMA into C#.
def calc_atr(df, high, low, close, timeperiod=14):
df['H_L'] = df[high] - df[low]
df['H_Cp'] = abs(df[high] - df[close].shift(1))
df['L_Cp'] = abs(df[low] - df[close].shift(1))
df['TR'] = df[["H_L", "H_Cp", "L_Cp"]].max(axis=1)
df['ATR'] = df['TR'].rolling(timeperiod).mean()
for i in range(timeperiod , len(df)):
df.loc[i, 'ATR'] = (df.loc[i - 1, 'ATR'] * (timeperiod -1) + df.loc[i, 'TR']) / timeperiod
return df
This is my attempt but I'm not doing the rolling window mean correctly. I think there was a way with LINQ, but I'm not sure how.
public static void CalcAtr(this List<Candle> source, int period = 14)
{
var highs = source.Select(e => e.High).ToArray();
var lows = source.Select(e => e.Low).ToArray();
var closes = source.Select(e => e.Close).ToArray();
var atr = new decimal[source.Count];
for (int i = period; i < source.Count; i++)
{
var hl = highs[i] - lows[i];
var hcp = Math.Abs(highs[i] - closes[i - 1]);
var lcp = Math.Abs(lows[i] - closes[i - 1]);
var tr = Math.Max(hl, Math.Max(hcp, lcp));
atr[i] = (atr[i - 1] * (period - 1) + tr) / period;
}
}

Related

Calculate breakout bands of Elliott wave oscillator

I am trying to calculate the values for the Elliott oscillator breakbands in python. I have the indicator logic in C#, which I will leave below.
In python I have already calculated and checked the value of the histogram, on which depends the calculation of the breakbands I want to calculate.
However, the values that the current logic gives are not the correct ones and I can't find what is the fault, since as I mentioned, I am replicating an existing logic in another language.
Actual logic in Python:
# For LowerBand, this return directly 0:
osc_fast = 5
osc_slow = 35
lens = osc_fast + osc_slow
pr = 2 / lens
strenght = 100
df = df.assign(Lwr_Line=0)
df['Lwr_Line'] = np.where(df["EWO_Std"] < 0, (df['EWO_Std']*pr) + (df['Lwr_Line'].shift(1)*(1-pr)), df['Lwr_Line'].shift(1))
df['LineEWOLwr'] = strenght / 100 * df['Lwr_Line']
df.drop(columns='Lwr_Line', inplace=True)
# For UpperBand:
df = df.assign(Upr_Line=0)
df['Upr_Line'] = np.where(df["EWO_Std"] > 0, (df['EWO_Std']*pr) + (df['Upr_Line'].shift(1)*(1-pr)), df['Upr_Line'].shift(1))
df['LineEWOUpr'] = strenght / 100 * df['Upr_Line']
df.drop(columns='Upr_Line', inplace=True)
Logic checked in C#:
{
MP[0] = ( High[0] + Low[0] ) / 2;
UprLine[0] = 0;
LwrLine[0] = 0;
Lens = OscFast + OscSlow;
Pr = 2.0/Lens;
if(CurrentBar < OscSlow){
OscAG = 0;
if (OscAG > 0){
OscAGUpr[0] = OscAG;
if (OscAGUpr[0] > OscAGUpr[1]){
OscAGUprDiv[0] = OscAG;
}
}
else{
OscAGLwr[0] = OscAG;
OscAGLwrDiv[0] = OscAG;
}
}
else{
OscAG = SMA(MP,OscFast)[0] - SMA(MP,OscSlow)[0];
if (OscAG > 0){
UprLine[0] = (OscAG*Pr) + (UprLine[1]*(1-Pr));
LwrLine[0] = LwrLine[1];
OscAGUpr[0] = OscAG;
if (OscAGUpr[0] > OscAGUpr[1])
{
OscAGUprDiv[0] = OscAG;
}
}
else{
UprLine[0] = UprLine[1];
LwrLine[0] = (OscAG*Pr) + (LwrLine[1]*(1-Pr));
OscAGLwr[0] = OscAG;
if (OscAGLwr[0] > OscAGLwr[1])
{
OscAGLwrDiv[0] = OscAG;
}
}
}
LineEWOUpr[0] = BOBStrength / 100 * UprLine[0];
LineEWOLwr[0] = BOBStrength / 100 * LwrLine[0];
}
Does anyone know what the error could be?
Thanks!
I'm tried many combinations, but no one works

Is there a way to create a file with python, containing accentuated letters, and read that with C#?

I can create files with C#, with accentuated letters in their name and content:
string itemtoorder = itemorderitemlist.SelectedItem.ToString();
int amounttoorder = int.Parse(itemorderamount.Text);
if (unitselect.Text == "gramm")
{
amounttoorder /= 1000;
}
DateTime noww = DateTime.Now;
int result = DateTime.Compare(dateTimePicker2.Value, noww);
if (result <= 0)
{
notifyIcon1.BalloonTipText = "Nem adhatsz fel a múltba rendelést!";
notifyIcon1.Icon = new Icon("Error.ico");
}
else
{
string today = DateTime.Today.ToString().Replace(" ", "").Replace("0:00:00", "");
string selecteddateasstring = dateTimePicker2.Value.Date.ToString().Replace(" ", "").Replace("0:00:00", "");
string[] writethingie = { itemtoorder, amounttoorder.ToString(), unitselect.Text, today,
selecteddateasstring, "M" };
string path = "data\\itemorders\\" + selecteddateasstring + "_" + itemtoorder + ".txt";
if (File.Exists(path))
{
DialogResult dialogResult = MessageBox.Show("Már van ugyanilyen rendelés ugyanerre a napra.", "Hozzáadjam a rendelt mennyiséget?", MessageBoxButtons.YesNo);
if (dialogResult == DialogResult.Yes)
{
string[] dataaaa = File.ReadAllLines(path);
dataaaa[1] = (int.Parse(dataaaa[1]) + amounttoorder).ToString();
File.WriteAllLines(path, dataaaa);
notifyIcon1.BalloonTipText = "A rendelés sikeresen ki lett bővítve!";
}
else if (dialogResult == DialogResult.No)
{
notifyIcon1.BalloonTipText = "Rendelés elvetve!";
}
}
else
{
File.WriteAllLines(path, writethingie);
notifyIcon1.Icon = new Icon("order.ico");
notifyIcon1.BalloonTipText = "Sikeres rendelésfelvétel!";
}
}
I can read the file in C#, that I created. It all works.
The problem is that if I create test data with a python script:
from datetime import datetime
from random import randrange
from datetime import timedelta
import string
def randomitem():
itemlist = [
"Ananász kocka (fagyasztott)",
"Barna rizsliszt",
"Barnarizs - Nagykun (szárazon)",
"Csicseriborsó (szárazon)",
"Csirkecomb filé (Bőr nélkül)",
"Hámozott paradicsom konzerv - Petti",
"Kukorica (morzsolt, fagyasztott)",
"Marhacomb (fagyasztott)",
"Pritamin paprika (fagyasztott)",
"Sárgarépa csík (fagyasztott)",
"Sárgarépa karika (fagyasztott)",
"Sűrített paradicsom - Aranyfácán",
"Vörösbab (szárazon)",
"Vöröshagyma kocka (fagyasztott)",
"Zeller (fagyasztott, csíkozott)",
]
return(itemlist[randrange(len(itemlist))])
def randomitem2():
itemlist = [
"Ananász SŰRÍTMÉNY",
"Citrom SŰRÍTMÉNY",
"Kókuszolaj - Barco",
"Kókusztejszín - RealThai",
]
return(itemlist[randrange(len(itemlist))])
def random_date(start, end):
delta = end - start
int_delta = (delta.days * 24 * 60 * 60) + delta.seconds
random_second = randrange(int_delta)
return start + timedelta(seconds=random_second)
def randomamount():
return randrange(10,70)
for i in range(200):
d1 = datetime.strptime('2022.1.1.', '%Y.%m.%d.')
d2 = datetime.strptime('2023.12.31.', '%Y.%m.%d.')
thida = str(random_date(d1, d2))
d3 = datetime.strptime('2021.4.1.', '%Y.%m.%d.')
d4 = datetime.strptime('2023.6.28.', '%Y.%m.%d.')
thirec = random_date(d3, d4)
if(i>160):
thisitem = randomitem2()
else:
thisitem = randomitem()
file = open(thida.replace("-",".").split()[0] + "._" + thisitem + ".txt" , "w")
file.write(thisitem + "\n")
file.write(str(randomamount()) + "\n")
if(i>160):
file.write("liter" + "\n")
else:
file.write("kilogramm" + "\n")
file.write(str(thirec + timedelta(days=4)).replace("-",".").split()[0] + "\n")
file.write(str(thirec).replace("-",".").replace(" ",".") + "\n")
file.write(thida.replace("-",".").replace(" ",".") + "\n")
Then the C# script can't read the accentuated letters.
Here's the reader:
string[] files = Directory.GetFiles("data/warehouse/");
string[] recipé = File.ReadAllLines("data/recipes/" + allines[0] + ".txt");
List<string> goodpaths = new List<string>();
List<int> goodams = new List<int>();
for (int i = 0; i < recipé.Length - 2; i += 2)
{
for (int j = 0; j < files.Length - 1; j++)
{
string cuf = Regex.Split(files[j], "._")[1];
cuf = cuf.Replace(".txt", "");
//Debug.WriteLine(recipé[i]);
//Debug.WriteLine(cuf);
if (recipé[i] == cuf)
{
//Debug.WriteLine("\n--==## Match ##==--\n");
goodpaths.Add(files[j]);
goodams.Add(int.Parse(recipé[i + 1]));
//Debug.WriteLine(files[j]);
}
}
goodpaths.Add("BREAK");
}
List<int> amos = new List<int>();
List<string> exps = new List<string>();
List<List<string>> UOT = new List<List<string>>();
string curitttttttt = "";
int counter = 0;
for (int i = 0; i < goodpaths.Count - 1; i++)
{
if (goodpaths[i] != "BREAK")
{
string[] thisisthecurrentfile = File.ReadAllLines(goodpaths[i]);
curitttttttt = thisisthecurrentfile[0];
//Debug.WriteLine(File.ReadAllLines(goodpaths[i])[0]);
amos.Add(int.Parse(thisisthecurrentfile[1]));
exps.Add(thisisthecurrentfile[5]);
}
else
{
int[] ams = amos.ToArray();
string[] exs = exps.ToArray();
int ned = goodams[counter];
int amo = int.Parse(allines[1]);
List<string>[] output = DATASET(ams, exs, ned, amo);
//Alapanyag neve
List<string> itnm = new List<string>();
itnm.Add(curitttttttt);
UOT.Add(itnm);
//Részletek
dataGridView1.ColumnCount = 1;
dataGridView1.RowCount = UOT.Count;
dataGridView1[0, counter].Value = UOT[counter][0];
counter++;
amos = new List<int>();
exps = new List<string>();
}
}
So to sum the problem:
I want to create test data with Python for a C# Windows Forms program, but the C# program can read only the files well if it creates and writes them. When the files are created by the Python script, the accentuated letters are displayed as question marks.

Louvain algorithm for graph clustering gives completely different result when running in Spark/Scala and Python, why is that happening?

I am running community detection in graphs made from telecom CDR data. First I was working with very dense graphs containing 10000 nodes, and the algorithm was producing 150 to 170 communities per graph. I was using Louvain community detection algorithm implemented in Scala for Spark.
When I try to run the same algorithm but implemented in C#, I get around 10 communities per graph. I also did some testing with smaller graph, containing around 300 nodes, and same thing occur. When I run it in Spark with Scala I get around 50 communities. When I run it in python or C# I get from 8 to 10 communities.
I am really surprised to see such difference. Every implementation that I used (Scala, Python or C#) is referring to the paper by VD Blondel https://arxiv.org/abs/0803.0476, so the algorithm should be the same, but the output is completely different. Did anyone experienced something like that when using Spark/Scala vs. python/c#?
This is how the main class Louvain is called:
import org.apache.spark.graphx.{Edge, Graph}
import org.apache.spark.{SparkContext, SparkConf}
import org.apache.log4j.{Level, Logger}
object Driver {
def main(args: Array[String]): Unit ={
val config = LouvainConfig(
"src/data/input/file_with_edges.csv", //input file
"src/data/output/", //output dir
1, //parallelism
2000, //minimumComplessionProgress
1, //progressCounter
",") //delimiter
val sc = new SparkContext("local[*]", "Louvain")
val louvain = new Louvain()
louvain.run(sc, config)
}
}
This is Scala implementation that I am using:
import scala.reflect.ClassTag
import com.esotericsoftware.kryo.io.{Input, Output}
import com.esotericsoftware.kryo.serializers.DefaultArraySerializers.ObjectArraySerializer
import com.esotericsoftware.kryo.{Kryo, KryoSerializable}
import org.apache.spark._
import org.apache.spark.graphx._
import org.apache.spark.rdd.RDD
import org.apache.spark.SparkContext._
import org.apache.spark.graphx._
import org.apache.spark.broadcast.Broadcast
//import org.apache.spark.{Logging, SparkContext}
import org.apache.spark.{SparkContext}
class Louvain() extends Serializable{
def getEdgeRDD(sc: SparkContext, conf: LouvainConfig, typeConversionMethod: String => Long = _.toLong): RDD[Edge[Long]] = {
sc.textFile(conf.inputFile, conf.parallelism).map(row => {
val tokens = row.split(conf.delimiter).map(_.trim())
tokens.length match {
case 2 => new Edge(typeConversionMethod(tokens(0)),
typeConversionMethod(tokens(1)), 1L)
case 3 => new Edge(typeConversionMethod(tokens(0)),
typeConversionMethod(tokens(1)), tokens(2).toDouble.toLong)
case _ => throw new IllegalArgumentException("invalid input line: " + row)
}
})
}
/**
* Generates a new graph of type Graph[VertexState,Long] based on an
input graph of type.
* Graph[VD,Long]. The resulting graph can be used for louvain computation.
*
*/
def createLouvainGraph[VD: ClassTag](graph: Graph[VD, Long]):
Graph[LouvainData, Long] = {
val nodeWeights = graph.aggregateMessages(
(e:EdgeContext[VD,Long,Long]) => {
e.sendToSrc(e.attr)
e.sendToDst(e.attr)
},
(e1: Long, e2: Long) => e1 + e2
)
graph.outerJoinVertices(nodeWeights)((vid, data, weightOption) => {
val weight = weightOption.getOrElse(0L)
new LouvainData(vid, weight, 0L, weight, false)
}).partitionBy(PartitionStrategy.EdgePartition2D).groupEdges(_ + _)
}
/**
* Creates the messages passed between each vertex to convey
neighborhood community data.
*/
def sendCommunityData(e: EdgeContext[LouvainData, Long, Map[(Long, Long), Long]]) = {
val m1 = (Map((e.srcAttr.community, e.srcAttr.communitySigmaTot) -> e.attr))
val m2 = (Map((e.dstAttr.community, e.dstAttr.communitySigmaTot) -> e.attr))
e.sendToSrc(m2)
e.sendToDst(m1)
}
/**
* Merge neighborhood community data into a single message for each vertex
*/
def mergeCommunityMessages(m1: Map[(Long, Long), Long], m2: Map[(Long, Long), Long]) = {
val newMap = scala.collection.mutable.HashMap[(Long, Long), Long]()
m1.foreach({ case (k, v) =>
if (newMap.contains(k)) newMap(k) = newMap(k) + v
else newMap(k) = v
})
m2.foreach({ case (k, v) =>
if (newMap.contains(k)) newMap(k) = newMap(k) + v
else newMap(k) = v
})
newMap.toMap
}
/**
* Returns the change in modularity that would result from a vertex
moving to a specified community.
*/
def q(
currCommunityId: Long,
testCommunityId: Long,
testSigmaTot: Long,
edgeWeightInCommunity: Long,
nodeWeight: Long,
internalWeight: Long,
totalEdgeWeight: Long): BigDecimal = {
val isCurrentCommunity = currCommunityId.equals(testCommunityId)
val M = BigDecimal(totalEdgeWeight)
val k_i_in_L = if (isCurrentCommunity) edgeWeightInCommunity + internalWeight else edgeWeightInCommunity
val k_i_in = BigDecimal(k_i_in_L)
val k_i = BigDecimal(nodeWeight + internalWeight)
val sigma_tot = if (isCurrentCommunity) BigDecimal(testSigmaTot) - k_i else BigDecimal(testSigmaTot)
var deltaQ = BigDecimal(0.0)
if (!(isCurrentCommunity && sigma_tot.equals(BigDecimal.valueOf(0.0)))) {
deltaQ = k_i_in - (k_i * sigma_tot / M)
//println(s" $deltaQ = $k_i_in - ( $k_i * $sigma_tot / $M")
}
deltaQ
}
/**
* Join vertices with community data form their neighborhood and
select the best community for each vertex to maximize change in
modularity.
* Returns a new set of vertices with the updated vertex state.
*/
def louvainVertJoin(
louvainGraph: Graph[LouvainData, Long],
msgRDD: VertexRDD[Map[(Long, Long), Long]],
totalEdgeWeight: Broadcast[Long],
even: Boolean) = {
// innerJoin[U, VD2](other: RDD[(VertexId, U)])(f: (VertexId, VD, U) => VD2): VertexRDD[VD2]
louvainGraph.vertices.innerJoin(msgRDD)((vid, louvainData, communityMessages) => {
var bestCommunity = louvainData.community
val startingCommunityId = bestCommunity
var maxDeltaQ = BigDecimal(0.0);
var bestSigmaTot = 0L
// VertexRDD[scala.collection.immutable.Map[(Long, Long),Long]]
// e.g. (1,Map((3,10) -> 2, (6,4) -> 2, (2,8) -> 2, (4,8) -> 2, (5,8) -> 2))
// e.g. communityId:3, sigmaTotal:10, communityEdgeWeight:2
communityMessages.foreach({ case ((communityId, sigmaTotal), communityEdgeWeight) =>
val deltaQ = q(
startingCommunityId,
communityId,
sigmaTotal,
communityEdgeWeight,
louvainData.nodeWeight,
louvainData.internalWeight,
totalEdgeWeight.value)
//println(" communtiy: "+communityId+" sigma:"+sigmaTotal+"
//edgeweight:"+communityEdgeWeight+" q:"+deltaQ)
if (deltaQ > maxDeltaQ || (deltaQ > 0 && (deltaQ == maxDeltaQ &&
communityId > bestCommunity))) {
maxDeltaQ = deltaQ
bestCommunity = communityId
bestSigmaTot = sigmaTotal
}
})
// only allow changes from low to high communties on even cyces and
// high to low on odd cycles
if (louvainData.community != bestCommunity && ((even &&
louvainData.community > bestCommunity) || (!even &&
louvainData.community < bestCommunity))) {
//println(" "+vid+" SWITCHED from "+vdata.community+" to "+bestCommunity)
louvainData.community = bestCommunity
louvainData.communitySigmaTot = bestSigmaTot
louvainData.changed = true
}
else {
louvainData.changed = false
}
if (louvainData == null)
println("vdata is null: " + vid)
louvainData
})
}
def louvain(
sc: SparkContext,
graph: Graph[LouvainData, Long],
minProgress: Int = 1,
progressCounter: Int = 1): (Double, Graph[LouvainData, Long], Int) = {
var louvainGraph = graph.cache()
val graphWeight = louvainGraph.vertices.map(louvainVertex => {
val (vertexId, louvainData) = louvainVertex
louvainData.internalWeight + louvainData.nodeWeight
}).reduce(_ + _)
val totalGraphWeight = sc.broadcast(graphWeight)
println("totalEdgeWeight: " + totalGraphWeight.value)
// gather community information from each vertex's local neighborhood
var communityRDD =
louvainGraph.aggregateMessages(sendCommunityData, mergeCommunityMessages)
var activeMessages = communityRDD.count() //materializes the msgRDD
//and caches it in memory
var updated = 0L - minProgress
var even = false
var count = 0
val maxIter = 100000
var stop = 0
var updatedLastPhase = 0L
do {
count += 1
even = !even
// label each vertex with its best community based on neighboring
// community information
val labeledVertices = louvainVertJoin(louvainGraph, communityRDD,
totalGraphWeight, even).cache()
// calculate new sigma total value for each community (total weight
// of each community)
val communityUpdate = labeledVertices
.map({ case (vid, vdata) => (vdata.community, vdata.nodeWeight +
vdata.internalWeight)})
.reduceByKey(_ + _).cache()
// map each vertex ID to its updated community information
val communityMapping = labeledVertices
.map({ case (vid, vdata) => (vdata.community, vid)})
.join(communityUpdate)
.map({ case (community, (vid, sigmaTot)) => (vid, (community, sigmaTot))})
.cache()
// join the community labeled vertices with the updated community info
val updatedVertices = labeledVertices.join(communityMapping).map({
case (vertexId, (louvainData, communityTuple)) =>
val (community, communitySigmaTot) = communityTuple
louvainData.community = community
louvainData.communitySigmaTot = communitySigmaTot
(vertexId, louvainData)
}).cache()
updatedVertices.count()
labeledVertices.unpersist(blocking = false)
communityUpdate.unpersist(blocking = false)
communityMapping.unpersist(blocking = false)
val prevG = louvainGraph
louvainGraph = louvainGraph.outerJoinVertices(updatedVertices)((vid, old, newOpt) => newOpt.getOrElse(old))
louvainGraph.cache()
// gather community information from each vertex's local neighborhood
val oldMsgs = communityRDD
communityRDD = louvainGraph.aggregateMessages(sendCommunityData, mergeCommunityMessages).cache()
activeMessages = communityRDD.count() // materializes the graph
// by forcing computation
oldMsgs.unpersist(blocking = false)
updatedVertices.unpersist(blocking = false)
prevG.unpersistVertices(blocking = false)
// half of the communites can swtich on even cycles and the other half
// on odd cycles (to prevent deadlocks) so we only want to look for
// progess on odd cycles (after all vertcies have had a chance to
// move)
if (even) updated = 0
updated = updated + louvainGraph.vertices.filter(_._2.changed).count
if (!even) {
println(" # vertices moved: " + java.text.NumberFormat.getInstance().format(updated))
if (updated >= updatedLastPhase - minProgress) stop += 1
updatedLastPhase = updated
}
} while (stop <= progressCounter && (even || (updated > 0 && count < maxIter)))
println("\nCompleted in " + count + " cycles")
// Use each vertex's neighboring community data to calculate the
// global modularity of the graph
val newVertices =
louvainGraph.vertices.innerJoin(communityRDD)((vertexId, louvainData,
communityMap) => {
// sum the nodes internal weight and all of its edges that are in
// its community
val community = louvainData.community
var accumulatedInternalWeight = louvainData.internalWeight
val sigmaTot = louvainData.communitySigmaTot.toDouble
def accumulateTotalWeight(totalWeight: Long, item: ((Long, Long), Long)) = {
val ((communityId, sigmaTotal), communityEdgeWeight) = item
if (louvainData.community == communityId)
totalWeight + communityEdgeWeight
else
totalWeight
}
accumulatedInternalWeight = communityMap.foldLeft(accumulatedInternalWeight)(accumulateTotalWeight)
val M = totalGraphWeight.value
val k_i = louvainData.nodeWeight + louvainData.internalWeight
val q = (accumulatedInternalWeight.toDouble / M) - ((sigmaTot * k_i) / math.pow(M, 2))
//println(s"vid: $vid community: $community $q = ($k_i_in / $M) - ( ($sigmaTot * $k_i) / math.pow($M, 2) )")
if (q < 0)
0
else
q
})
val actualQ = newVertices.values.reduce(_ + _)
// return the modularity value of the graph along with the
// graph. vertices are labeled with their community
(actualQ, louvainGraph, count / 2)
}
def compressGraph(graph: Graph[LouvainData, Long], debug: Boolean = true): Graph[LouvainData, Long] = {
// aggregate the edge weights of self loops. edges with both src and dst in the same community.
// WARNING can not use graph.mapReduceTriplets because we are mapping to new vertexIds
val internalEdgeWeights = graph.triplets.flatMap(et => {
if (et.srcAttr.community == et.dstAttr.community) {
Iterator((et.srcAttr.community, 2 * et.attr)) // count the weight from both nodes
}
else Iterator.empty
}).reduceByKey(_ + _)
// aggregate the internal weights of all nodes in each community
val internalWeights = graph.vertices.values.map(vdata =>
(vdata.community, vdata.internalWeight))
.reduceByKey(_ + _)
// join internal weights and self edges to find new interal weight of each community
val newVertices = internalWeights.leftOuterJoin(internalEdgeWeights).map({ case (vid, (weight1, weight2Option)) =>
val weight2 = weight2Option.getOrElse(0L)
val state = new LouvainData()
state.community = vid
state.changed = false
state.communitySigmaTot = 0L
state.internalWeight = weight1 + weight2
state.nodeWeight = 0L
(vid, state)
}).cache()
// translate each vertex edge to a community edge
val edges = graph.triplets.flatMap(et => {
val src = math.min(et.srcAttr.community, et.dstAttr.community)
val dst = math.max(et.srcAttr.community, et.dstAttr.community)
if (src != dst) Iterator(new Edge(src, dst, et.attr))
else Iterator.empty
}).cache()
// generate a new graph where each community of the previous graph is
// now represented as a single vertex
val compressedGraph = Graph(newVertices, edges)
.partitionBy(PartitionStrategy.EdgePartition2D).groupEdges(_ + _)
// calculate the weighted degree of each node
val nodeWeights = compressedGraph.aggregateMessages(
(e:EdgeContext[LouvainData,Long,Long]) => {
e.sendToSrc(e.attr)
e.sendToDst(e.attr)
},
(e1: Long, e2: Long) => e1 + e2
)
// fill in the weighted degree of each node
// val louvainGraph = compressedGraph.joinVertices(nodeWeights)((vid,data,weight)=> {
val louvainGraph = compressedGraph.outerJoinVertices(nodeWeights)((vid, data, weightOption) => {
val weight = weightOption.getOrElse(0L)
data.communitySigmaTot = weight + data.internalWeight
data.nodeWeight = weight
data
}).cache()
louvainGraph.vertices.count()
louvainGraph.triplets.count() // materialize the graph
newVertices.unpersist(blocking = false)
edges.unpersist(blocking = false)
println("******************************************************")
println (louvainGraph.vertices.count())
louvainGraph
}
def saveLevel(
sc: SparkContext,
config: LouvainConfig,
level: Int,
qValues: Array[(Int, Double)],
graph: Graph[LouvainData, Long]) = {
val vertexSavePath = config.outputDir + "/level_" + level + "_vertices"
val edgeSavePath = config.outputDir + "/level_" + level + "_edges"
// save
graph.vertices.saveAsTextFile(vertexSavePath)
graph.edges.saveAsTextFile(edgeSavePath)
// overwrite the q values at each level
sc.parallelize(qValues, 1).saveAsTextFile(config.outputDir + "/qvalues_" + level)
}
//def run[VD: ClassTag](sc: SparkContext, config: LouvainConfig, graph: Graph[VD, Long]): Unit = {
def run[VD: ClassTag](sc: SparkContext, config: LouvainConfig): Unit = {
val edgeRDD = getEdgeRDD(sc, config)
val initialGraph = Graph.fromEdges(edgeRDD, None)
var louvainGraph = createLouvainGraph(initialGraph)
var compressionLevel = -1 // number of times the graph has been compressed
var q_modularityValue = -1.0 // current modularity value
var halt = false
var qValues: Array[(Int, Double)] = Array()
do {
compressionLevel += 1
println(s"\nStarting Louvain level $compressionLevel")
// label each vertex with its best community choice at this level of compression
val (currentQModularityValue, currentGraph, numberOfPasses) =
louvain(sc, louvainGraph, config.minimumCompressionProgress, config.progressCounter)
louvainGraph.unpersistVertices(blocking = false)
louvainGraph = currentGraph
println(s"qValue: $currentQModularityValue")
qValues = qValues :+ ((compressionLevel, currentQModularityValue))
saveLevel(sc, config, compressionLevel, qValues, louvainGraph)
// If modularity was increased by at least 0.001 compress the graph and repeat
// halt immediately if the community labeling took less than 3 passes
//println(s"if ($passes > 2 && $currentQ > $q + 0.001 )")
if (numberOfPasses > 2 && currentQModularityValue > q_modularityValue + 0.001) {
q_modularityValue = currentQModularityValue
louvainGraph = compressGraph(louvainGraph)
}
else {
halt = true
}
} while (!halt)
//finalSave(sc, compressionLevel, q_modularityValue, louvainGraph)
}
}
The code is taken from github https://github.com/athinggoingon/louvain-modularity.
Here is the example of input file, just first 10 lines. The graph is made from csv file, schema is : node1, node2, weight_of_the_edge
104,158,34.23767571520276
146,242,12.49338107205348
36,37,0.6821403413414481
28,286,2.5053934980726456
9,92,0.34412941554076487
222,252,10.502288293870677
235,282,0.25717021769814874
264,79,18.555996343792327
24,244,1.7094102023399587
231,75,21.698401383558213

CUDA: does size of input/output data have to be a multiple of the number of threads per block?

I have a Python code (for implementing RayTracing) that I'm running in parallel with PyCuda.
import pycuda.driver as drv
import pycuda.autoinit
from pycuda.compiler import SourceModule
import numpy as np
from stl import mesh
import time
my_mesh = mesh.Mesh.from_file('test_solid_py.stl')
n = my_mesh.normals
v0 = my_mesh.v0
v1 = my_mesh.v1
v2 = my_mesh.v2
v0_x = v0[:,0]
v0_x = np.ascontiguousarray(v0_x)
v0_y = v0[:,1]
v0_y = np.ascontiguousarray(v0_y)
v0_z = v0[:,2]
v0_z = np.ascontiguousarray(v0_z)
v1_x = v1[:,0]
v1_x = np.ascontiguousarray(v1_x)
v1_y = v1[:,1]
v1_y = np.ascontiguousarray(v1_y)
v1_z = v1[:,2]
v1_z = np.ascontiguousarray(v1_z)
v2_x = v2[:,0]
v2_x = np.ascontiguousarray(v2_x)
v2_y = v2[:,1]
v2_y = np.ascontiguousarray(v2_y)
v2_z = v2[:,2]
v2_z = np.ascontiguousarray(v2_z)
mod = SourceModule("""
#include <math.h>
__global__ void intersect(float *origin,float *dir_x,float *dir_y,float *dir_z,float *v0_x,float *v0_y,float *v0_z,float *v1_x,float *v1_y,float *v1_z,float *v2_x,float *v2_y,float *v2_z,float *int_point_real_x, float *int_point_real_y,float *int_point_real_z)
{
using namespace std;
unsigned int idx = blockDim.x*blockIdx.x + threadIdx.x;
int count = 0;
float v0_current[3];
float v1_current[3];
float v2_current[3];
float dir_current[3] = {dir_x[idx],dir_y[idx],dir_z[idx]};
float int_point[3];
float int_pointS[2][3];
int int_faces[2];
float dist[2];
dist[0] = -999;
int n_tri = 105500;
for(int i = 0; i<n_tri; i++) {
v0_current[0] = v0_x[i];
v0_current[1] = v0_y[i];
v0_current[2] = v0_z[i];
v1_current[0] = v1_x[i];
v1_current[1] = v1_y[i];
v1_current[2] = v1_z[i];
v2_current[0] = v2_x[i];
v2_current[1] = v2_y[i];
v2_current[2] = v2_z[i];
double eps = 0.0000001;
float E1[3];
float E2[3];
float s[3];
for (int j = 0; j < 3; j++) {
E1[j] = v1_current[j] - v0_current[j];
E2[j] = v2_current[j] - v0_current[j];
s[j] = origin[j] - v0_current[j];
}
float h[3];
h[0] = dir_current[1] * E2[2] - dir_current[2] * E2[1];
h[1] = -(dir_current[0] * E2[2] - dir_current[2] * E2[0]);
h[2] = dir_current[0] * E2[1] - dir_current[1] * E2[0];
float a;
a = E1[0] * h[0] + E1[1] * h[1] + E1[2] * h[2];
if (a > -eps && a < eps) {
int_point[0] = false;
}
else {
double f = 1 / a;
float u;
u = f * (s[0] * h[0] + s[1] * h[1] + s[2] * h[2]);
if (u < 0 || u > 1) {
int_point[0] = false;
}
else {
float q[3];
q[0] = s[1] * E1[2] - s[2] * E1[1];
q[1] = -(s[0] * E1[2] - s[2] * E1[0]);
q[2] = s[0] * E1[1] - s[1] * E1[0];
float v;
v = f * (dir_current[0] * q[0] + dir_current[1] * q[1] + dir_current[2] * q[2]);
if (v < 0 || (u + v)>1) {
int_point[0] = false;
}
else {
float t;
t = f * (E2[0] * q[0] + E2[1] * q[1] + E2[2] * q[2]);
if (t > eps) {
for (int j = 0; j < 3; j++) {
int_point[j] = origin[j] + dir_current[j] * t;
}
//return t;
}
}
}
}
if (int_point[0] != false) {
count = count+1;
int_faces[count-1] = i;
dist[count-1] = sqrt(pow((origin[0] - int_point[0]), 2) + pow((origin[1] - int_point[1]), 2) + pow((origin[2] - int_point[2]), 2));
for (int j = 0; j<3; j++) {
int_pointS[count-1][j] = int_point[j];
}
}
}
double min = dist[0];
int ind_min = 0;
for (int i = 0; i < 2; i++){
if (min > dist[i]) {
min = dist[i];
ind_min = i;
}
}
if (dist[0] < -998){
int_point_real_x[idx] = -999;
int_point_real_y[idx] = -999;
int_point_real_z[idx] = -999;
}
else{
int_point_real_x[idx] = int_pointS[ind_min][0];
int_point_real_y[idx] = int_pointS[ind_min][1];
int_point_real_z[idx] = int_pointS[ind_min][2];
}
}
""")
n_rays = 20000
num_threads = 1024
num_blocks = int(n_rays/num_threads)
origin = np.asarray([-2, -2, -2]).astype(np.float32)
origin = np.ascontiguousarray(origin)
rand_x = np.random.randn(n_rays)
rand_y = np.random.randn(n_rays)
rand_z = np.random.randn(n_rays)
direction_x = np.ones((n_rays, 1)) * 3
direction_x = direction_x.astype(np.float32)
direction_x = np.ascontiguousarray(direction_x)
direction_y = np.ones((n_rays, 1)) * 4
direction_y = direction_y.astype(np.float32)
direction_y = np.ascontiguousarray(direction_y)
direction_z = np.ones((n_rays, 1)) * 5
direction_z = direction_z.astype(np.float32)
direction_z = np.ascontiguousarray(direction_z)
int_point_real_x = np.zeros((n_rays, 1)).astype(np.float32)
int_point_real_x = np.ascontiguousarray(int_point_real_x)
int_point_real_y = np.zeros((n_rays, 1)).astype(np.float32)
int_point_real_y = np.ascontiguousarray(int_point_real_y)
int_point_real_z = np.zeros((n_rays, 1)).astype(np.float32)
int_point_real_z = np.ascontiguousarray(int_point_real_z)
intersect = mod.get_function("intersect")
start = time.time()
intersect(drv.In(origin), drv.In(direction_x),drv.In(direction_y),drv.In(direction_z),drv.In(v0_x),drv.In(v0_y),drv.In(v0_z), drv.In(v1_x),drv.In(v1_y),drv.In(v1_z), drv.In(v2_x), drv.In(v2_y), drv.In(v2_z), drv.Out(int_point_real_x),drv.Out(int_point_real_y),drv.Out(int_point_real_z), block=(num_threads, 1, 1), grid=((num_blocks+0), 1, 1))
finish = time.time()
print(finish-start)
I give as input some arrays whose size is 20k (dir_x, dir_y, dir_z) and I have as output 3 arrays (int_point_real_x,int_point_real_y,int_point_real_z) that have the same size as the above mentioned arrays (20k).
If n_rays is a multiple of num_threads, e.g. n_rays=19456 and num_threads=1024, then int_point_real_x_y_z are correctly filled by the kernel.
Otherwise, if n_rays is NOT a multiple of num_threads, e.g. n_rays=20000 (what I really need) and num_threads=1024, then int_point_real_x_y_z are filled by the kernel up to position 19455 and the 544 spots left in the array are not filled.
Does anyone know if this is a rule of CUDA?
If it's not, how could I modify my code in order to use an arbitrary size of input array (and not only multiple of num_threads)?
Thanks

your int(n_rays/num_threads) is rounding down
to fix this, you need to round up and then put a condition into the kernel to enforce that idx is valid and "do nothing" if it's not. this will cause some cores to waste time, but your code looks pretty suboptimal anyway so it probably won't matter much

How to repeat an animation in plotly

I am using the example from:
https://plot.ly/python/filled-area-animation/
How can I play the animation repeatedly?
So once the play button is clicked it will continue playing repeatedly?
Another option is to have the animation play non stop once the animation is loaded.

One solution can be found here. The idea is that instead of passing all frames beforehand you create them on the fly using the update function.
The plot is created with Plotly.animate().
You can also change this example to fit your case. Iterate through your list of data points. Once you reach the end of your array, set the counter to 0 to start at the beginning.
This example uses redraw: false, which can be used to update the plot at high frame rates.
var n = 100;
var x = [], y = [], z = [];
var dt = 0.015;
for (i = 0; i < n; i++) {
x[i] = Math.random() * 2 - 1;
y[i] = Math.random() * 2 - 1;
z[i] = 30 + Math.random() * 10;
}
Plotly.plot('graph', [{
x: x,
y: z,
mode: 'markers'
}], {
xaxis: {range: [-40, 40]},
yaxis: {range: [0, 60]}
})
function compute () {
var s = 10, b = 8/3, r = 28;
var dx, dy, dz;
var xh, yh, zh;
for (var i = 0; i < n; i++) {
dx = s * (y[i] - x[i]);
dy = x[i] * (r - z[i]) - y[i];
dz = x[i] * y[i] - b * z[i];
xh = x[i] + dx * dt * 0.5;
yh = y[i] + dy * dt * 0.5;
zh = z[i] + dz * dt * 0.5;
dx = s * (yh - xh);
dy = xh * (r - zh) - yh;
dz = xh * yh - b * zh;
x[i] += dx * dt;
y[i] += dy * dt;
z[i] += dz * dt;
}
}
function update () {
compute();
Plotly.animate('graph', {
data: [{x: x, y: z}]
}, {
transition: {
duration: 0
},
frame: {
duration: 0,
redraw: false
}
});
requestAnimationFrame(update);
}
requestAnimationFrame(update);

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

pandas .rolling().mean() analog in C# - python

Related

Calculate breakout bands of Elliott wave oscillator

Is there a way to create a file with python, containing accentuated letters, and read that with C#?

Louvain algorithm for graph clustering gives completely different result when running in Spark/Scala and Python, why is that happening?

CUDA: does size of input/output data have to be a multiple of the number of threads per block?

How to repeat an animation in plotly

Categories

Resources