Sometimes when I run my Python script, which calls shp2pgsql to upload a new table to the database, the table appears with blank column names when I view it in pgAdmin:
This one has column names
Usually when I run the script again it fixes the problem, and pgadmin displays a message about database vacuuming. Honestly the problem is my boss because he takes this as a sign there is something wrong with my code and we can't move forward until he sees the names in pgadmin (by chance when I demonstrated the script it was the 1/10 time that it messed up without the column names).
In postgres is it even possible to have a table without column names?
Here is the vacuum message
Here is the output from psql's \d (assume XYZ is the name of the project and the name of the db)
xyz => \d asmithe.intersect
Table "asmithe.intersect"
Column | Type | Modifiers
------------+------------------------------+------------------------------------------------------------
gid | integer | not null default nextval('intersection_gid_seq'::regclass)
fid_xyz_09 | integer |
juris_id | character varying(2) |
xyz_plot | numeric |
poly_id | character varying(20) |
layer | character varying(2) |
area | numeric |
perimeter | numeric |
lid_dist | integer |
comm | character varying(252) |
cdate | character varying(30) |
sdate | character varying(30) |
edate | character varying(30) |
afsdate | character varying(30) |
afedate | character varying(30) |
capdate | character varying(30) |
salvage | double precision |
pb_harv | double precision |
utotarea | numeric |
nbacvers | character varying(24) |
totarea | numeric |
areamoda | numeric |
areamodb | numeric |
areamodt | double precision |
areamodv | numeric |
area_intr | numeric |
dist_perct | numeric |
id | double precision |
floodid | double precision |
basr | double precision |
floodmaps | double precision |
floodmapm | double precision |
floodcaus | double precision |
burnclas | double precision |
geom | geometry(MultiPolygon,13862) |
Indexes:
"intersect_pkey" PRIMARY KEY, btree (gid)
Quitting and restarting pgAdmin usually does fix it.
In postgres is it even possible to have a table without column names?
It is possible to create a table with zero columns:
test=> CREATE TABLE zerocolumns();
CREATE TABLE
test=> \d zerocolumns
Table "public.zerocolumns"
Column | Type | Modifiers
--------+------+-----------
but not a zero-width column name:
test=> CREATE TABLE zerowidthcol("" integer);
ERROR: zero-length delimited identifier at or near """"
LINE 1: CREATE TABLE zerowidthcol("" integer);
^
though a column name composed only of a space is permissible:
test=> CREATE TABLE spacecol(" " integer);
CREATE TABLE
test=> \d spacecol
Table "public.spacecol"
Column | Type | Modifiers
--------+---------+-----------
| integer |
Please show the output from psql's \d command if this happens. With only (heavily edited) screenshots I can't tell you anything more useful.
If I had to guess I'd say it's probably a drawing bug in PgAdmin.
Update: The VACUUM message is normal after big changes to a table. Read the message; it explains what is going on. There is no problem there.
There's nothing wrong with the psql output, and since quitting and restarting PgAdmin fixes it, I'm pretty confident you've hit a PgAdmin bug related to drawing or catalog access. If it happens on the current PgAdmin version and you can reproduce it with a script you can share with the public, please post a report on the pgadmin-support mailing list.
The same happened to me in pgAdmin 1.18.1 when running the DDL (i.e. SQL script that drops and recreates all tables). After restarting pgAdmin or refreshing the database it is working again (just refreshing the table is not sufficient). It seems that pgAdmin simply does not auto-refresh table metadata after the tables are replaced.
In this table: https://www.db-fiddle.com/f/mU8RhMyiNb6RBjdaYZztci/0
I try to insert two names, girisken and girişken. MySQL treats them as duplicates, so the second one cannot be inserted because the first already exists, and I am fine with that.
But in Python, when I do
a = 'girisken'
b = 'girişken'
print(a == b)
it gives me False, which causes problems when I try to insert the values into MySQL.
How can I solve this problem?
My collation in MySQL is utf8mb4_unicode_520_ci.
Character equivalence is based on the collation.
In MySQL's default collation utf8mb4_0900_ai_ci, 's' is equal to 'ş':
mysql> select 'girisken' = 'girişken' as same;
+------+
| same |
+------+
| 1 |
+------+
I infer that this word is from the Türkçe language (I assume since the country now wants to be known as Türkiye, their language maybe should be known as Türkçe?).
So the proper collation should be chosen for that language:
mysql> select 'girisken' = 'girişken' collate utf8mb4_turkish_ci as same;
+------+
| same |
+------+
| 0 |
+------+
Or in MySQL 8.0, an updated collation is available:
mysql> select 'girisken' = 'girişken' collate utf8mb4_tr_0900_ai_ci as same;
+------+
| same |
+------+
| 0 |
+------+
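On the Python side, the comparison can be made to behave roughly like an accent-insensitive collation before the values are sent to MySQL. The following is only a sketch, assuming Python 3: it decomposes each string with unicodedata, drops the combining marks, and case-folds the result. The helper name fold_accents is hypothetical, and this approximates rather than reproduces what utf8mb4_unicode_520_ci does.

```python
import unicodedata


def fold_accents(text):
    """Return text with combining marks removed and case folded.

    This only approximates an accent-insensitive, case-insensitive
    collation; it is not a reimplementation of any MySQL collation.
    """
    # NFD splits 'ş' into 's' followed by a combining cedilla.
    decomposed = unicodedata.normalize('NFD', text)
    # Keep only the characters that are not combining marks.
    stripped = ''.join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


a = 'girisken'
b = 'girişken'
print(a == b)                              # False
print(fold_accents(a) == fold_accents(b))  # True
```

Comparing the folded forms tells you in advance whether MySQL is likely to reject the second value as a duplicate. Letters that have no decomposition (for example the Turkish dotless ı) are left unchanged by this approach.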
I have a table which has columns named measured_time, data_type and value.
In data_type, there are two types, temperature and humidity.
I want to combine two rows of data if they have the same measured_time, using the Django ORM.
I am using MariaDB.
With raw SQL, the following query does what I want:
SELECT T1.measured_time, T1.temperature, T2.humidity
FROM ( SELECT CASE WHEN data_type = 1 then value END as temperature,
CASE WHEN data_type = 2 then value END as humidity ,
measured_time FROM data_table) as T1,
( SELECT CASE WHEN data_type = 1 then value END as temperature ,
CASE WHEN data_type = 2 then value END as humidity ,
measured_time FROM data_table) as T2
WHERE T1.measured_time = T2.measured_time and
T1.temperature IS NOT null and T2.humidity IS NOT null and
DATE(T1.measured_time) = '2019-07-01'
Original Table
| measured_time | data_type | value |
|---------------------|-----------|-------|
| 2019-07-01-17:27:03 | 1 | 25.24 |
| 2019-07-01-17:27:03 | 2 | 33.22 |
Expected Result
| measured_time | temperature | humidity |
|---------------------|------------|----------|
| 2019-07-01-17:27:03 | 25.24 | 33.22 |
I've never used it and so can't answer in detail, but you can feed a raw SQL query into Django and get the results back through the ORM. Since you already have the SQL, this may be the easiest way to proceed. See the Django documentation on performing raw SQL queries.
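If you go the raw SQL route, the self-join can probably be replaced by a single pass with conditional aggregation. The sketch below is an assumption-laden illustration, not the asker's setup: it uses Python's built-in sqlite3 in place of MariaDB so that it runs on its own, and it writes the timestamp with a space instead of the hyphen shown in the question so that SQLite's DATE() can parse it. The table and column names are taken from the question.

```python
import sqlite3

# In-memory SQLite stands in for MariaDB here (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE data_table (measured_time TEXT, data_type INTEGER, value REAL)"
)
conn.executemany(
    "INSERT INTO data_table VALUES (?, ?, ?)",
    [
        ("2019-07-01 17:27:03", 1, 25.24),  # data_type 1 = temperature
        ("2019-07-01 17:27:03", 2, 33.22),  # data_type 2 = humidity
    ],
)

# One row per measured_time; each CASE picks out the value for one data_type.
query = """
    SELECT measured_time,
           MAX(CASE WHEN data_type = 1 THEN value END) AS temperature,
           MAX(CASE WHEN data_type = 2 THEN value END) AS humidity
    FROM data_table
    WHERE DATE(measured_time) = '2019-07-01'
    GROUP BY measured_time
    HAVING MAX(CASE WHEN data_type = 1 THEN value END) IS NOT NULL
       AND MAX(CASE WHEN data_type = 2 THEN value END) IS NOT NULL
"""
rows = conn.execute(query).fetchall()
print(rows)
```

The HAVING clause keeps only timestamps that have both a temperature and a humidity reading, which matches the IS NOT NULL conditions in the original query.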
I have the following models which represent songs and the plays of each song:
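from django.db import models

# Song is defined first so that Play can reference it.
class Song(models.Model):
    # max_length is required by CharField; the values here are illustrative.
    name = models.CharField('Name', max_length=255)

class Play(models.Model):
    play_day = models.PositiveIntegerField()
    source = models.CharField(
        'source',
        max_length=20,
        choices=(('radio', 'Radio'), ('streaming', 'Streaming')),
    )
    song = models.ForeignKey(Song, verbose_name='song')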
Imagine I have the following entries:
Songs:
|ID | name |
|---|---------------------|
| 1 | Stairway to Heaven |
| 2 | Riders on the Storm |
Plays:
|ID | play_day | source | song_id |
|---|----------|-----------|---------|
| 1 | 2081030 | radio | 1 |
| 1 | 2081030 | streaming | 1 |
| 2 | 2081030 | streaming | 2 |
I would like to list all the tracks as follows:
| Name | Day | Sources |
|---------------------|------------|------------------|
| Stairway to Heaven | 2018-10-30 | Radio, Streaming |
| Riders on the Storm | 2018-10-30 | Streaming |
I am using Django==1.9.2, django_tables2==1.1.6 and django-filter==0.13.0 with PostgreSQL.
Problem:
I'm using Song as the model of the table and the filter, so the queryset starts with a select FROM song. However, when joining the Play table, I get two entries in the case of "Stairway to Heaven" (I know, even one is too much: https://www.youtube.com/watch?v=RD1KqbDdmuE).
What I tried:
I tried adding a distinct on Song, but this means I cannot sort by any column other than Song.id (supposing I do the distinct on that column).
Aggregate: this yields a final result, a dictionary, which cannot be used with django_tables.
I found this solution for PostgreSQL Selecting rows ordered by some column and distinct on another though I don't know how to do this with django.
Question:
What would be the right approach to show one track per line "aggregating" information from references using Django's ORM?
I think that the proper way to do it is to use the array_agg postgresql function (http://postgresql.org/docs/9.5/static/functions-aggregate.html and http://lorenstewart.me/2017/12/03/postgresqls-array_agg-function).
Django seems to actually support this (in v. 2.1 at least: http://docs.djangoproject.com/en/2.1/ref/contrib/postgres/aggregates/) thus that seems like the way to go.
Unfortunately I don't have time to test it right now so I can't provide a thorough answer; however try something like: Song.objects.all().annotate(ArrayAgg(...))
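To make the intended result concrete, here is a plain-Python sketch of what an array_agg over the plays of each song would compute, using the rows from the question. It does not touch Django or PostgreSQL; the function name aggregate_sources is hypothetical. In Django the corresponding annotation might look like Song.objects.annotate(sources=ArrayAgg('play__source')), assuming the default reverse lookup name play, but that call is untested here.

```python
from collections import OrderedDict

# (id, name) rows from the Songs table in the question.
songs = [(1, 'Stairway to Heaven'), (2, 'Riders on the Storm')]

# (play_day, source, song_id) rows from the Plays table in the question.
plays = [
    (2081030, 'radio', 1),
    (2081030, 'streaming', 1),
    (2081030, 'streaming', 2),
]


def aggregate_sources(songs, plays):
    """Return one (name, [sources]) pair per song, like array_agg would."""
    grouped = OrderedDict((song_id, []) for song_id, _ in songs)
    for _play_day, source, song_id in plays:
        # Skip repeated sources so each one is listed once per song.
        if source not in grouped[song_id]:
            grouped[song_id].append(source)
    names = dict(songs)
    return [(names[song_id], sources) for song_id, sources in grouped.items()]


rows = aggregate_sources(songs, plays)
for name, sources in rows:
    print(name, ', '.join(sources))
```

Because each song ends up as exactly one row with its sources collected in a list, the result can still be sorted and filtered per song, which is the property the distinct-based attempts were missing.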
I'm having trouble entering string values into a MySQL table without quotation marks via Django/Python.
Currently, I have the following schema:
Field | Type | Null | Key | Default | Extra
ticker | varchar(10) | NO | PRI | NULL | |
In my population script, I have values[0], which references the first value in an array, previously split on comma.
>>> values[0], type(values[0])
('"AAPL"', <type 'str'>)
I then assign this value to my model and save it:
fi = MarketData()
fi.ticker = values[0]
fi.save()
In the database table, this value is stored with double quotes:
+--------+
| ticker |
+--------+
| "AAPL" |
+--------+
This prevents me from joining on a separate table where the primary key is also the ticker, and the values in that table are stored without any quotation marks.
My Django model:
ticker = models.CharField(max_length=10, primary_key=True, db_column='ticker')
I have tried converting the double quotes to single quotes, and escaping them, but haven't had much luck. Thanks.
Strip the quotation marks with str.strip:
new_value = values[0].strip('"')  # for double-quoted strings
You can also do it for single-quoted strings:
new_value = values[0].strip("'")  # for single-quoted strings
Then store it in the database.
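As a small sketch of both options: the helper clean_field (a hypothetical name) strips surrounding quotes from a value that has already been split, while the standard csv module removes the quotes during parsing, which avoids the problem at the source. The sample line below is made up for illustration.

```python
import csv


def clean_field(raw):
    """Remove surrounding whitespace and one layer of quote characters."""
    return raw.strip().strip('"').strip("'")


# Option 1: clean a value that was produced by a plain split on commas.
print(clean_field('"AAPL"'))  # AAPL

# Option 2: let the csv module handle the quoting while parsing.
line = '"AAPL",101.5,"NASDAQ"'  # hypothetical input line
values = next(csv.reader([line]))
print(values[0])  # AAPL
```

The csv approach also copes with commas inside quoted fields, which a plain split on commas would break apart.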
I have the following type of data:
The data is segmented into "frames" and each frame has a start and stop "gpstime". Within each frame are a bunch of points with a "gpstime" value.
There is a frames model that has a frame_name,start_gps,stop_gps,...
Let's say I have a list of gpstime values and want to find the corresponding frame_name for each.
I could just do a loop...
framenames = [frames.objects.filter(start_gps__lte=gpstime[idx],stop_gps__gte=gpstime[idx]).values_list('frame_name',flat=True) for idx in range(len(gpstime))]
This will give me a list of 'frame_name', one for each gpstime. This is what I want. However this is very slow.
What I want to know: is there a better way to perform this lookup, getting a frame_name for each gpstime, that is more efficient than iterating over the list? This list could get fairly large.
Thanks!
EDIT: Frames model
class frames(models.Model):
    frame_id = models.AutoField(primary_key=True)
    frame_name = models.CharField(max_length=20)
    start_gps = models.FloatField()
    stop_gps = models.FloatField()

    def __unicode__(self):
        return "%s" % (self.frame_name)
If I understand correctly, gpstime is a list of the times, and you want to produce a list of framenames with one for each gpstime. Your current way of doing this is indeed very slow because it makes a db query for each timestamp. You need to minimize the number of db hits.
The first answer that comes to mind uses numpy. Note that I'm not making any extra assumptions here. If your gpstime list can be sorted, i.e. the ordering does not matter, then it could be done much faster.
Try something like this:
from numpy import array

# One query for all frames instead of one query per gpstime.
rows = list(frames.objects.values_list('start_gps', 'stop_gps', 'frame_name'))
frame_start_times = array([r[0] for r in rows])
frame_end_times = array([r[1] for r in rows])
frame_names = array([r[2] for r in rows])

frame_names_for_times = []
for time in gpstime:
    # Boolean mask of the frames whose [start, stop] interval contains this time.
    mask = (frame_start_times <= time) & (frame_end_times >= time)
    frame_names_for_times.append(frame_names[mask].tolist())
EDIT:
Since the list is sorted, you can use .searchsorted():
from numpy import array as a

gpstimes = a([151, 152, 153, 190, 649, 652, 920, 996])
starts = a([100, 600, 900, 1000])
ends = a([180, 650, 950, 1000])
names = a(['a', 'b', 'c', 'd'])
for time in gpstimes:
    # Frames are sorted and non-overlapping, so a time inside frame k
    # lands at position k+1 in starts and position k in ends.
    start_pos = starts.searchsorted(time)
    end_pos = ends.searchsorted(time)
    if start_pos - 1 == end_pos:
        print('%s %s' % (time, names[end_pos]))
    else:
        print(str(time) + ' was not within any frame')
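The same binary-search idea can be sketched with the standard library's bisect module, for cases where numpy is not available. This assumes, as above, that the frames are sorted by start time and do not overlap; the function name frame_for_time is hypothetical and the data is the toy data from the example above. Unlike the searchsorted version, this sketch treats both the start and the stop of a frame as inclusive.

```python
from bisect import bisect_right


def frame_for_time(t, starts, ends, names):
    """Return the name of the frame containing t, or None if there is none.

    Assumes starts is sorted ascending and the frames do not overlap.
    """
    # Index of the last frame whose start is <= t.
    i = bisect_right(starts, t) - 1
    if i >= 0 and t <= ends[i]:
        return names[i]
    return None


starts = [100, 600, 900, 1000]
ends = [180, 650, 950, 1000]
names = ['a', 'b', 'c', 'd']
gpstimes = [151, 152, 153, 190, 649, 652, 920, 996]

result = [frame_for_time(t, starts, ends, names) for t in gpstimes]
print(result)
```

Each lookup is a binary search over the frame starts, so the total cost grows with the number of gpstime values times the logarithm of the number of frames, after a single query to load the frames.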
The best way to speed things up is to add indexes to those fields:
start_gps = models.FloatField(db_index=True)
stop_gps = models.FloatField(db_index=True)
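and then apply the change to the database. Note that manage.py syncdb does not alter tables that already exist, so for an existing table use a schema migration (manage.py makemigrations followed by manage.py migrate on Django 1.7 and later) or create the indexes manually.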
The frames table is very large, but I have another value that lowers
the frames searched in this case to under 50. There is not really a
pattern, each frame starts at the same gpstime the previous stops.
I don't quite understand how you lowered the number of searched frames to 50, but if you're searching for, say, 10,000 gpstime values in only 50 frames, then it's probably easiest to load those 50 frames into RAM, and do the search in Python, using something similar to foobarbecue's answer.
However, if you're searching for, say, 10 gpstime values in the entire table which has, say, 10,000,000 frames, then you may not want to load all 10,000,000 frames into RAM.
You can get the DB to do something similar by adding the following index...
ALTER TABLE myapp_frames ADD UNIQUE KEY my_key (start_gps, stop_gps, frame_name);
...then using a query like this...
(SELECT frame_name FROM myapp_frames
WHERE 2.5 BETWEEN start_gps AND stop_gps LIMIT 1)
UNION ALL
(SELECT frame_name FROM myapp_frames
WHERE 4.5 BETWEEN start_gps AND stop_gps LIMIT 1)
UNION ALL
(SELECT frame_name FROM myapp_frames
WHERE 7.5 BETWEEN start_gps AND stop_gps LIMIT 1);
...which returns...
+------------+
| frame_name |
+------------+
| Frame 2 |
| Frame 4 |
| Frame 7 |
+------------+
...and for which an EXPLAIN shows...
+----+--------------+--------------+-------+---------------+--------+---------+------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------+--------------+-------+---------------+--------+---------+------+------+--------------------------+
| 1 | PRIMARY | myapp_frames | range | my_key | my_key | 8 | NULL | 3 | Using where; Using index |
| 2 | UNION | myapp_frames | range | my_key | my_key | 8 | NULL | 5 | Using where; Using index |
| 3 | UNION | myapp_frames | range | my_key | my_key | 8 | NULL | 8 | Using where; Using index |
| NULL | UNION RESULT | <union1,2,3> | ALL | NULL | NULL | NULL | NULL | NULL | |
+----+--------------+--------------+-------+---------------+--------+---------+------+------+--------------------------+
...so you can do all the lookups in one query which hits that index, and the index should be cached in RAM.
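For an arbitrary list of gpstime values, the UNION ALL query can be generated rather than written by hand. The following is a sketch only: build_frame_lookup_query is a hypothetical helper, the table name myapp_frames comes from the example above, and the %s placeholders assume a driver that uses that parameter style (as Django's database cursor does). It only builds the SQL string and the parameter list; executing it is left to the caller.

```python
def build_frame_lookup_query(gpstimes, table="myapp_frames"):
    """Build one UNION ALL query with a parameterised subquery per gpstime.

    Returns (sql, params). The values are passed as parameters, never
    interpolated into the SQL text; only the table name is formatted in.
    """
    gpstimes = list(gpstimes)
    if not gpstimes:
        raise ValueError("at least one gpstime is required")
    subquery = (
        "(SELECT frame_name FROM {table} "
        "WHERE %s BETWEEN start_gps AND stop_gps LIMIT 1)"
    ).format(table=table)
    sql = "\nUNION ALL\n".join([subquery] * len(gpstimes))
    return sql, gpstimes


sql, params = build_frame_lookup_query([2.5, 4.5, 7.5])
print(sql)
print(params)
```

One caveat with this query shape: a gpstime that falls in no frame contributes no row, so the result rows cannot be matched to the inputs by position alone. Selecting the searched value alongside frame_name in each subquery would be one way around that.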