runs.log
Tags: (see below for more details)
NSF1: first Tag of code for NSF petascale RFP:
NSF2: Tag of debugged dnsp code
NSF3: Tag dnsp + instrumented FFT and A2A overlap
GPL1: Tag of code before GPL branch (7/22/2008)
==========================================================================
2/13/2009
ANL Intrepid dns x-pencil vs. p3dfft timings.
dnsp original x-pencil code
dnsp2 x-pencil code with optimization to remove y-transposes
dnsp3: code uses P3DFFT
dnsp per timestep
512^3 (8x1x256) 0.02036 min
dnsp2 per timestep
512^3 (8x1x256) 0.02020 min
2048^3 (32x1x1024) 0.20168
2048^3 (128x1x256) 0.18770
dnsp3
512^3 (1x8x256) 0.01794 min
2048^3 (1x32x1024) 0.12643
2048^3 (1x64x512) 0.12675
2048^3 (1x128x256)
TIMING JUST THE FORWARD TRANSFORM
Mark Straka's timings:
out.4x1024.2048_NEW_ESSL.output: proc_id, cpu time per loop 0 0.98941431058823536
out.4x1024.2048_NEW_FFTW.output: proc_id, cpu time per loop 0 1.34859306352939257
My best ESSL time translates (dividing by 8, assuming perfect scaling :)) to a
time for 32k procs of about 0.125s. (0.1686 for FFTW)
My timings:
2048^3 1K nodes 8K cores
p3dfft 4-1024 2.895 forward & back
p3dfft 4-1024 1.249 forward only
2048^3 8K nodes 32K cores
p3dfft 128-256 0.5517 0.5497 (ERROR! timing *both* directions)
p3dfft 64-512 0.3704 0.3705 (ERROR! timing *both* directions)
p3dfft 32-1024 0.1855 0.1860 (fixed, timing only forward transform)
dnsp2 128-1-256 0.3956 0.4093
dnsp2 64-1-512 hung
dnsp2 32-1-1024 0.6754 0.6723
why does that one hang? scaled down version:
on each proc: 32x2048x4
64^3
dnsp2: 2x1x16
*******************************************************************************
7/22/2008
NSF1: first Tag of code for NSF petascale RFP:
NSF2: Tag of debugged dnsp code
NSF3: Tag dnsp + instrumented FFT and A2A overlap
GPL1: Tag of code before GPL branch (7/22/2008)
GPL1_branch: branch at GPL1
GPL1_branch_tag1 tag after adding all GPL stuff
GPL1_branch_tag2 a few more edits, including #define TRUNC in params.F90
passes these tests:
../testing/test3d_forcing.sh r
../testing/test.sh 1r
On the trunk, I then tried:
cvs update -j GPL1 -j GPL1_branch_tag1 src
BIG MISTAKE - this deletes the analysis*.F90 routines that were deleted in GPL1_branch,
so I gave up and added the GPL headers to the trunk by hand.
*******************************************************************************
9/6/2007
balu's PDF idea
running balu2.inp at 1024^2 on 64 blackrose nodes (256 cores)
cost to run time=10: 12.68min
balu_a
*******************************************************************************
8/25/2007
testing boussinesq vs dns:
dnsb with theta: 10 timesteps: .32 23% more expensive
dns with 0 scalars, 10 timesteps: .26
64^3 problem, 2 cores on dosadi F7
*******************************************************************************
12/02/2006
sc2048decay
restart from sc2048A below, at time t=1.0, but with no forcing.
added u3 to diagnostics. See how this decays for Jens Lorenz.
decay2048 does not have u3 or u4:
they are in turb-diag now, but they were not present at the time
decay2048 was run. Also, u3 does not have the absolute value.
*******************************************************************************
11/23
Benchmarks on RS/dualcore with FFTW
4096^3 done
3072^3 done
2048^3 done
1024^3 done
512^3 done
Redoing with A2A overlap:
4096^3 done
3072^3 done
Bandwidth numbers:
Running Ramanan's transpose code (MPI_alltoall in subcommunicators)
Min. over all runs, of the max time
tr max tr+copy
4096x4 .00636 .2886 (???)
4x4096 .00185 .00409
128x128 .00444 .0136
DNS code. 4096^3, using a 4x1x4096 decomposition:
transpose_to/from_x: .00725 (this is like the 4x4096 case above)
transpose_to/from_z: .0123 (this is like the 4096x4 case above)
*******************************************************************************
11/14
NOTE: 11/23: FFTW routine is over 2X faster than FFT99,
so I stopped with these benchmarks and I'm rerunning the above
dns code:
RS size -d output:
4096: 804307056 * 18384 = 12272 GB 512GB per array
511175888*8*4096 = 15599GB
3072: 1417862352*3072 = 4056 GB (19 arrays)
2048: 637377744*2048 = 1215 GB (19 arrays)
at high CPU counts, it looks like there is some storage growing
faster than linear?
RS Jumbo runs
dnsp = 20 arrays = 20*8*N^3 bytes = 1.49e-7 N^3 GB (see the sketch after this list)
4096^3 10240 GB min VN mode: 16384
16384 allocate() fails.
try with static allocation IN-Q
3072^3 4320 GB VN mode: 6144
CO mode
3072nodes
VN mode
18432 done
9216cores done
6144cores not enough memory. should require .7gb per core.
6144cores ok w/o MPICH_UNEX_BUFFER_SIZE 180M)
2048^3 1300 GB min VN mode: 2048 cpus.
VN mode
16384 done
8192 done
4096 done
2048 ?
1024^3 160GB min VN mode: 256 cpus
16384 ?
8192 ?
4096->256 done
512^3 ? redo this data?
16384
8192
4096
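Quick check of the memory estimate above (dnsp = 20 double-precision arrays of 8*N^3 bytes each). Throwaway Python sketch, not part of the dns code; the helper name is made up.
# hypothetical helper, just evaluates the 20*8*N^3 estimate from this log
def dnsp_memory_gb(n, arrays=20):
    return arrays * 8.0 * n**3 / 2**30   # bytes -> GB (binary)
for n in (1024, 2048, 3072, 4096):
    print(n, dnsp_memory_gb(n))   # 160, 1280, 4320, 10240 GB, close to the sizes quoted above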
*******************************************************************************
8/23
FLOP count runs
N^3 = 16,24,32,48,64
time = -1,-2,-3
dnsp
forcing12-bench.inp (but set to 1,2 or 3 timesteps)
Couldn't get opcontrol to work. (lapic not used?)
see if FLOPS is possible:
0. sudo opcontrol --list-events
1. sudo opcontrol --no-vmlinux
2. sudo opcontrol --start
--event=FLOPS
3. ./dnsp -i temp.inp
4. opreport -l ./dnsp
2. sudo opcontrol --shutdown
EVENT = FLOPS
On Cobalt: (NCSA)
qsub -I -V -l walltime=00:30:00,ncpus=8,mem=8gb
pfmon -efp_ops_retired dnsp -i temp.inp
%module load histx+.1.2a
%lipfpm -h
%lipfpm -e CPU_CYCLES -e FP_OPS_RETIRED ./a.out
resolution: 16^3 flops FLOP per timestep:
-6 80161720 diff: 8949342
-5 71212378 diff: 8949549
-4 62262829 diff: 8949448
time=-3(4 timesteps): 53313381 diff: 8949364
time=-2(3 timesteps): 44364017 diff: 9230453
time=-1(2 timesteps): 35133564
resolution: 32^3 flops diff
-6 696585740 78413858
-5 618171882 78414194
-4 539757688 78413964
-3 461343724
-3 NCPU=2 65139
1834353
resolution: 64^3 flops diff
-6 6189080299 702394764
-5 5486685535 702395163
-4 4784290372 702395166
-3 4081895206
resolution: 128^3 flops diff
-4 41068986552 6062569260
-3 35006417292
resolution: 256^3 flops diff
-4 358258523792 53102820392 26min
-3 305155703400
resolution: 12^3 flops diff
-4 26885228 3873213
-3 23012015
resolution: 24^3 flops diff
-6 300568043 33806552
-5 266761491 33807033
-4 232954458 33806951
-3 199147507
resolution: 48^3 flops diff
-6 2662196785 302423113
-5 2359773672 302423661
-4 2057350011 302423679
-3 1754926332
resolution: 96^3 flops diff
-4 17571976245 2595103294
-3 14976872951
resolution: 192^3 flops diff
-4 153149117982 22711603057
-3 130437514925 (5min)
Fit to: c1 4 N^3 + c2 4*27 N^3 log2(N): (because we do 27 FFT's per timestep; no constant term in the fit)
clear; N=[16,32,64,128,256];
for i=1:length(N)
A(i,:) = [ 4*N(i)^3 , 4*27*log2(N(i))*N(i)^3 ]/N(i)^3;
end
b = [8949342 ; 78413858 ; 702394764; 6062569260; 53102820392]' ./ (N.^3);
c=A\b'; % least-squares solve for c = [c1;c2]
semilogx(N,b,'o'); hold on;
x=12:256; f=(c(1)*4*x.^3+c(2)*4*27*log2(x).*x.^3)./(x.^3);
plot(x,f); hold off;
disp(sprintf('%10.1f N^3 + %5.3f 27 N^3 log2(N) ',c(1),c(2)));
(A*c - b')./b'
# answer: 4( 297 N^3 + 2.28 27 N^3 log2(N) )
# error: less than 1%
clear; N=[12, 24,48,96,192];
b = [ 3873213; 33806552 ; 302423113; 2595103294;22711603057 ]'./(N.^3);
for i=1:length(N)
A(i,:) = [ 4*N(i)^3 , 4*27*N(i)^3*log2(N(i)) ]/N(i)^3;
end
c=A\b'; % least-squares solve for c = [c1;c2]
loglog(N,b,'o'); hold on;
x=12:256; f=(c(1)*4*x.^3+c(2)*4*27*log2(x).*x.^3)./(x.^3);
plot(x,f); hold off;
disp(sprintf('%10.1f N^3 + %5.3f 27 N^3 log2(N) ',c(1),c(2)));
(A*c - b')./b' % residual
# answer: 4( 340 N^3 + 2.24 27 N^3 log2(N) )
# error: less than 1%
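Same fit redone in Python/NumPy as a cross-check of the first series above (N=16..256); a scratch sketch, not part of the analysis scripts, and it should land near the 297 / 2.28 quoted above.
import numpy as np
# FLOPs per timestep (the "diff" column) for N = 16..256 from this log
N = np.array([16., 32., 64., 128., 256.])
F = np.array([8949342., 78413858., 702394764., 6062569260., 53102820392.])
# model: F/N^3 = 4*c1 + 4*27*log2(N)*c2, least squares over the 5 sizes
A = np.column_stack([4*np.ones_like(N), 4*27*np.log2(N)])
b = F / N**3
c1, c2 = np.linalg.lstsq(A, b, rcond=None)[0]
print(c1, c2)   # expect roughly 297 and 2.28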
*******************************************************************************
8/23
2048^3: 5 days on 4096 cores.
3072^3: 25 days on 4096 cores.
Redstorm run:
sc2048A: (see sc2048A.job, sc2048A.inp)
This run should take 13K timesteps. on 4096cores, .5min per = 120h
2048 run modeled after sc1024A:
sc2048A data:
ke = 1.95 eps=4.00 mu=1.35e-5 kmax=965*2*pi
kmax*eta = .959 eta=.000158 ett = .976
eyeball averages over last 25% of run:
ke = 1.90 eps=3.8 mu=1.35e-5 kmax=965*2*pi
(and this agrees better with sc1024A data) R_lambda = 700
sc1024A data:
ke=1.886 eps=3.58 mu = .35e-4
eta*kmax_spherical = eta * nx*pi*2*sqrt(2)/3 = 1.0
sc3072A modeled after sc2048A data:
ASSUME eps=4.00 kmax=1448*2*pi
nu = eps^(1/3) k_max^(-4/3) = .84e-5
IN, but always gets ec_node failure errors.
dsacp rs:/scratch2/mataylo/sc2048A hpss:dns
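Quick check of the sc3072A viscosity estimate above (scratch Python, values copied from this entry):
from math import pi
eps, kmax = 4.0, 1448*2*pi                 # assumed eps and kmax for sc3072A, as above
print(eps**(1.0/3.0) * kmax**(-4.0/3.0))   # ~0.84e-5, matching the nu quoted above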
*******************************************************************************
7/27
NSF1: first Tag of code for NSF petascale RFP:
NSF2: Tag of debugged dnsp code
NSF3: Tag dnsp + instrumented FFT and A2A overlap
*******************************************************************************
7/20/2006
N=12288
Rl = 2000
delt=.0001 ett
vorticity, velocity, pressure saved every .02 ett
my delt: sqrt(ke) delt/delx = .17 (see below)
ett = 2*ke/eps
lambda=sqrt( mu*(2*Euse/3) ./ (epsilon/15) );
Rl = lambda*sqrt( 2 KE/3) / mu
Rl = KE sqrt(20/3 / (mu*epsilon))
eps = mu * grad(u)^2
eta = (mu^3/epsilon)^.25 = delta_x/3 mu ~= N^(-4/3)
2x = .40
12x = .036
Resolution Requirements N^3, 2/3 dealiasing:
ke=1.886 eps=3.58
eta*kmax = eta * 2*pi*(N/3)
eta*kmax = (mu^3/epsilon)^.25 * N*pi*(2/3)
mu^3 = eps*[(eta*kmax)/(pi*2N/3)]**4
mu = eps**(1/3) * [(eta*kmax)/(2*pi*N/3)]**(4/3)
Rl = ke sqrt(20/3/(mu*eps))
12288 run modeled after sc1024A:
ke=1.886 eps=3.58 mu = 1.25e-6
delt = 1e-5 sqrt(ke) delt/delx = .17
Rl = 2290
eta*kmax_spherical = eta * nx*pi*2*sqrt(2)/3 = 1.0
eta*kmax_2/3 = eta * nx*pi*(2/3) = .70
12288 with Rl=2000
ke=1.886 eps=3.58 mu = 1.6e-6
delt = 1e-5 sqrt(ke) delt/delx = .17
Rl = 2034
eta*kmax_spherical = eta * nx*pi*2*sqrt(2)/3 = 1.2
eta*kmax_2/3 = eta * nx*pi*(2/3) = .84
12288 with Rl=2000
ke=1.886 eps=3.58 mu = 2e-6
delt = 1e-5 sqrt(ke) delt/delx = .17
Rl = 2034
eta*kmax_spherical = eta * nx*pi*2*sqrt(2)/3 = 1.2
eta*kmax_2/3 = eta * nx*pi*(2/3) = .84
sc4096A: should take 20 days on 16K cores
sc3072A: 8.3 days on 12K cores
sc1024A runs:
eps: 3.58
eta: .000331 eta/delx = .3388
spherical: eta * kmax = eta* nx*pi*2*sqrt(2)/3 = 1.0034
2/3: eta * kmax = eta* nx*pi*(2/3) = .7095
Rl=435
ett=1.05
mu=3.5e-5
maxU = 6.18,6.33,6.04
maxUcfl = 11.77
ke = 1.886 sqrt(ke)=1.37 = maxU/4.5
eps=3.58 k eps-2/3 = .806
delt ~ 1.2e-4
My CFL: (delt=.0001)
11.77 * delt/delx = 1.5 delt< .13/N OR
8.6 sqrt(ke) delt/delx = 1.5 1.9*maxU delt/delx = 1.5
sqrt(ke) delt/delx = .17 maxU delt/delx = .8
Pope: sqrt(k) delt/delx = .05
maxU delt/delx = .22
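Scratch Python check of the sc1024A numbers above against the Rl / eta relations at the top of this entry (values copied from this log; delx assumed to be 1/N, i.e. a unit box, consistent with the CFL numbers above):
from math import sqrt, pi
ke, eps, mu, nx, delt = 1.886, 3.58, 3.5e-5, 1024, 1.2e-4   # sc1024A
Rl  = ke * sqrt(20.0/(3.0*mu*eps))        # ~435
eta = (mu**3/eps)**0.25                   # ~.000331
kmax_sph = nx*pi*2.0*sqrt(2.0)/3.0
print(Rl, eta, eta*kmax_sph)              # ~435, ~.000331, ~1.00
print(sqrt(ke)*delt/(1.0/nx))             # CFL number, ~.17 as noted above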
64^3 forcing12.inp case (ke=1.71)
.005 sqrt(k) delt/delx = .41 blows up 12 timesteps
.0045 blows up t=3
.004 sqrt(k) delt/delx = .333 stable to t=10
eta*kmax=1, eps=3.58
mu = eps**(1/3) * [(eta*kmax)/(2*pi*N*sqrt(2)/3)]**(4/3)
= .36 N**(-4/3) (checked in the sketch after the table)
mu mu (formula)
512 1e-4 .88e-4
768 5.1e-5 cpus: 128, 192, 256, 384, [2,3,4,6,8]x384
1024 3.5e-5 3.5e-5
1536 2.0e-5
2048 1.38e-5
3072 8.06e-6
4096 5.5e-6
12288 1.25e-6 1.3e-6
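The mu(formula) column above is just the expression a few lines up evaluated at each N; minimal Python sketch (assumes spherical-dealiasing kmax = 2*sqrt(2)*pi*N/3, eta*kmax = 1, eps = 3.58; the function name is made up):
from math import pi, sqrt
def mu_for_N(n, eps=3.58, eta_kmax=1.0):
    # mu = eps**(1/3) * (eta*kmax / kmax)**(4/3), kmax = 2*sqrt(2)*pi*n/3
    kmax = 2.0*sqrt(2.0)*pi*n/3.0
    return eps**(1.0/3.0) * (eta_kmax/kmax)**(4.0/3.0)
for n in (512, 1024, 2048, 4096, 12288):
    print(n, mu_for_N(n))   # ~8.8e-5, 3.5e-5, 1.38e-5, 5.5e-6, 1.3e-6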
*******************************************************************************
3/6/2006
Kerr's compression of our 2048^3 data set project.
TODO:
PYTHON: read compress.o
find time in decay.out
print out comparison: scalars.m, 2048^3 read
how to get files off of HPSS and onto lindor?
ccs-2 machine "lindor": /usb1 /usb2 /usb3
psi doesn't support 2GB files
sftp hpss.lanl.gov connection refused
to get files off of HPSS onto QSC:
xpsi get --pfsComp 32 filename
2048^3 = 192 GB snapshots 6min to read
1440^3 = 67 GB snapshots 22min to read/write
1440z = 22 GB snapshots 1min to read/write
2048 -> 1440 grid to spectral truncated 3 arrays = 192 GB
6min to read, 22min to write
1440 -> 1440 spectral to compressed, save stats! 8 arrays = 178 GB
22min to read, 1min to write
1440 stats 8 arrays = 178 GB
1min to read
70 files, 192GB = 13TB
22GB = 1.5TB
list of times:
0000.4020
0000.4026
0000.4188
0000.4328
0000.4603
0000.4894
0000.5551
0000.6034
0000.6491
0000.7019
0000.7536
0000.8149
0000.8545
0000.9040
0000.9512
0001.0017
0001.0511
0001.1038
0001.1598
0001.1959
0001.2500
0001.3081
0001.3457
0001.4034
0001.4434
0001.5038
0001.5439
0001.6124
0001.6586
0001.7075
0001.7569
0001.8033
0001.8484
0001.8988
0001.9493
0001.9955
0002.0416
0002.1140
0002.1581
0002.2066
0002.2609
0002.3157
0002.3700
0002.4247
0002.4519
0002.5090
0002.5500
0002.6100
0002.6700
0002.7021
0002.7621
0002.7922
0002.8522
0002.9181
0002.9511
0003.0145
0003.0812
0003.1481
compress-in: started 5/2 evening
0003.2146
0003.2838
0003.3563
0003.3924
0003.4596
save-in
0003.5279
0003.6027
0003.6781
0003.7527
0003.7900
=============
Test case on D800:
64^3 using "decay.inp" initial condition
temp0001.0000.[uvw]
OS/porting glitches: (D800, LAM MPI)
MPI_REAL8 not defined. put in MPI_REAL8=MPI_DOUBLE_PRECISION
put in MPI_OFFSET for arguments to mpi_file_seek.
bug (fixed): mpi_bcast on character*16 arrays:
length argument to mpi_bcast has to be multiplied by 16,
since each string has 16 characters.
bug (fixed) bottom of subroutine SETUP:
Most of the input is read only by process 0 and then
broadcast to the other processes. But at the end of
SETUP all processes were doing some input file reads,
and for me, the my_id<>0 processes were crashing.
bug (not fixed) SPEC7
in RSSTIO, the call to spec7() hangs. I tracked this down
to the mpi_allreduce of length "m1". Two processes have m1=0,
while the remaining processes have m1=6.
The calls to spec8, spec9 and spec10a also hang, but
I didn't track down what was causing this.
I just commented out all of these calls.
bug (not fixed) FSPASS
same "m1" problem. In the mpi_allreduce, 6 of the 8 processes think
m1=3, but the remaining processes think m1=0.
This problem I tracked down, but it took me quite a lot of
debugging. The problem turns out to be that with 8 processors,
using your c16 data file, some of the processes do not
execute the "j1" loop in FSPASS, and hence they do not
call SPEC23D (which is where m1 seems to be set), and so
when they call SPEC23D_T, they have m1=0 and the all_reduce fails.
Set jdebug=0 on Kerr's advice to avoid all of the above problems.
on 8 cpus, it hangs. On 4 cpus, there is a problem in RSPASS,
the allocate statement, the second dimension, nword3b=0
RSPASS: nword3b=0 disable allocate on Kerr's advice
It seems uw still needs to be allocated, so I only don't
allocate u if nword3b=0.
Then I got a strange error about memory allocation, which went
away if I added ",STAT=istat" to all the allocate statements
(and I check istat after each one to make sure the allocate
worked).
On 1 and 4 cpus the code now runs and produces output,
but the output is different. I've attached:
1.out stdout for 1 cpu run
fsave.1 'fsave' output file
4.out stdout for 4 cpu run
fsave.4 'fsave' output file
on 2 cpus, the code crashes (I haven't tracked this down yet)
and on 8 cpus the code still hangs, but I haven't tracked that
down yet either.
2 cpu code: added another STAT= and fixed the problem.
8 cpu case: problem is m1=3,0 in spec23_d
*******************************************************************************
6/30/2005
rotation case (modeled after leslie smith's run)
to add passive scalar:
./gridsetup.py 1 1 32 128 128 128 2 2 2 0 0 0 4
256x256x32 R0=.88, .48, .16
R0 = (epsilon_f * (2*pi*k_f)**2 )**(1/3) / (.5*fcor) (checked in the sketch at the end of this entry)
Leslie's runs:
fcor =13.8 R0=.88 E saturates
fcor = 76 R0=.16 E monotone increasing. ran for t=170 = 6460 revs.
2 Omega = fcor
new input file:
initial cond: none subtype=0
forcing: iso_high_16
Lz = 1/8
fcor = chosen to achieve the R0 given above
Bous = 0
hyper4 = del**4 = laplacian**2
coefficient = 1 (auto scaled)
My first runs: (dosadi) not saved. used for debugging and tuning:
128x128x32
k_f = 8 fcor=15 mu_hyper=.01, .1, 1.0, 10.0
epsilon_f ~= .5
epsilon = .41
Everything looks good: but no inverse cascade with fcor=15
had to go up to fcor=150 to get an inverse cascade.
New runs:
rotA 128x128x32
k_f=16 fcor=15 mu_hyper=1.0 mu=0
epsilon_f .52
epsilon_ke .52
R0 = 2.32
No inverse cascade.
rotB 128x128x32
k_f=16 fcor=40 mu_hyper=1.0 mu=0
eddy turnover time: .72
epsilon_f .52
epsilon_ke .51
R0 = .87
KE: .1 -> .22 in time=10
1/Omega = time for one revolution:
time * Omega = number of revolutions of run
Grashof number: ||f|| L^(1.5) / nu^2
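Scratch check of the R0 values quoted for rotA/rotB above, using the Rossby number formula near the top of this entry (Python, numbers from this log; the function name is made up):
from math import pi
def rossby(eps_f, k_f, fcor):
    # R0 = (epsilon_f * (2*pi*k_f)**2)**(1/3) / (.5*fcor)
    return (eps_f*(2.0*pi*k_f)**2)**(1.0/3.0) / (0.5*fcor)
print(rossby(0.52, 16, 15.0))   # rotA: ~2.32
print(rossby(0.52, 16, 40.0))   # rotB: ~0.87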
*******************************************************************************
10/3/2004
New Monika VXPAIR case: fcor, Z scale and Bous parameter are in the input file
init_subtype==6 (see cases2v.F90)
delta=.1
viscosity = 1e-7
./gridsetup.py NCPU 1 1 1 4800 2880 1 2 2 0 2 2 0 2
./gridsetup.py NCPU 1 1 1 2400 1440 1 2 2 0 2 2 0 2
KIWI: 4 cpus time(min) per timestep: .468 timestep ~ .0011s
2 cpus .644
qsc: 32 .062
64 .0371 1.09 days
shankara: (some of these timings were not using 2 cpu per node,
I should redo them)
2 .666
4 .391
8: .255
16: .161 (4.7 days to t=50)
24 .109 3.2 days
shankara-lam
4: 1.51
16: .738
delt=.0012
time_per_timestep(m)*50/.0012/60 = time for run in hours
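The day counts quoted above follow from that formula; quick Python version of the same arithmetic (nothing new, function name made up):
def run_days(min_per_step, t_final=50.0, delt=0.0012):
    # timesteps to t_final = t_final/delt; minutes -> hours -> days
    return min_per_step*(t_final/delt)/60.0/24.0
for label, m in (("shankara 16", 0.161), ("shankara 24", 0.109), ("qsc 64", 0.0371)):
    print(label, run_days(m))   # ~4.7, ~3.2, ~1.1 days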
vx4800a delta=.1, viscosity=1e-7 init_subtype=6
4800x2880
yscale=1.8 xlocation=1.5
vx2400a: same input file as above
delta = .1, viscosity=1e-7
ubar = 0.098
jack ran this problem: 2640x1520 domain: [-1.7 , 1.6] x [0 1.9]
I'm running: 2400x1440 domain: [-1.5 , 1.5] x [0 1.8]
IN shankara
NEXT:?
vx4800b NEW RUN with delta=.05, viscosity=1e-7. same ubar? init_subtype?
**********************************************************************************
10/3/2004
Evelyn found a bug during restart, at 2048^2? (or was it 4096^2)
Checking 2048^2 on 4 cpus (kiwi)
./gridsetup.py 16 1 1 4096 2048 1 2 2 0 2 2 0 2
eve1 (done) output: .01, .02
eve2 (done) restart from .01, run to .02
**********************************************************************************
1/30/04
fractional structure functions for Susan Kurien
exponents: -.8, -.6, -.4, -.2, ...
queued up all runs
complete archived cnslgw
2.4 x x x
2.3 x x
2.2 x x x
2.1 x x
2.0 x x
1.9 IN floating point divide by zero?
1.8 x x
1.7 x x
1.6 RUN missing data - don't use
1.5 x
1.4 x
1.3 x x
1.2 x
1.1 x
1.0 x x
**********************************************************************************
12/1/03
cospec results:
cospec from sc1024A: looks like -7/3?
cospec from decay2048: IN
cospec from subcubes:
**********************************************************************************
11/6/03
step 1:
set restart=0
run tmix256D-noscalars.inp used to generate initial condition
run to t=.3
step 2:
set restart=1
leave "name=tmix256D-noscalars", but set "refin=tmix256D-rescale.inp"
run to t=0, just to generate rescaled initial condition
rename the output: tmix256D0000.3000-noscalars-rescale.* tmix256D000.3000.*
step3
set name=tmix256D
rename directory tmix256D-noscalars tmix256D
run restart run, with compute new passive scalars
step4
regular restart (uvw and passive scalars)
tmix256D input file parameters:
init_cond_subtype==3
spectrum peaked at k=6 (instead of 10)
lowered mu from 3e-4 to 1.75e-4 because of change in k_peak.
(kmax*eta at t=0 is 1.5)
Change KE corr. scalar from type 1 to 3: ke_thresh = .5? debug.
add to matlab: Sk_{ln eps_c} = <psi^3>/<psi^2>^{3/2}
K_{ln eps_c} = <psi^4>/<psi^2>^{2}
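Definitions of the two moment ratios above, sketched in Python/NumPy just to pin them down (psi is assumed to be the mean-removed ln(eps_c) field; this is not the actual matlab code):
import numpy as np
def skew_flatness(psi):
    # Sk = <psi^3>/<psi^2>**1.5,  K = <psi^4>/<psi^2>**2
    psi = psi - psi.mean()
    m2 = np.mean(psi**2)
    return np.mean(psi**3)/m2**1.5, np.mean(psi**4)/m2**2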
Check with Ray:
velocity: now peaked at 6, not 10. slope still k**2 for k<6
scalars:
Gaussian scalar: same as before: double delta
KE correlated scalar: distribution chosen so that: c=1 peak 6x larger than
c=0 peak.
need to change subtype and scalar type in new .inp file.
**********************************************************************************
10/24/03
VXPAIR Kras initial condition
init_cond_subtype=100
./gridsetup.py 16 1 1 640 512 1 2 2 0 2 2 0 2
make dnsvor
see RUNME script in parent directory
vx2560a
2560x2048 (400MB) IN ~9 hours to t=5 on 4 cpus milkyway
data stored in ~/data/kras
init_cond_subtype=100
mu=1e-6 should also be able to do 1e-7?
hard coded: (code may change later)
biotsavart_cutoff=5e-3
biotsavart_apply=50
delta=.1
biotsavart_ubar=.100
vx2560b
2560x2048 (400MB) IN l1
data stored in /netscratch/taylorm/kras
init_cond_subtype=100
mu=1e-7 should also be able to do 1e-7?
hard coded: (code may change later)
biotsavart_cutoff=5e-3
biotsavart_apply=50
delta=.1
biotsavart_ubar=.1
vx2560c
2560x2048 (400MB) IN l1