Comments on 'Fast parallel computing with Intel Phi coprocessors'
https://blog.revolutionanalytics.com/2015/05/behold-the-power-of-parallel/

diego commented (2015-05-25):

Great post. What motherboard did you get for the Xeon Phi coprocessors?

Marcos F commented (2015-05-23):

Dear Andrew:
For your own good, learn about Sparse Arrays...

For example, using Mathematica:
a = 10000; b = 10^-9;
cc = SparseArray[{{1, 1} -> 1 - b, {a, a} -> 1, {a, a - 1} -> 1,
     {i_, i_} -> 1 - 1.75*b, {i_, j_} /; i - 1 == j -> b,
     {i_, j_} /; i == j - 1 -> .75}, {a, a}, 0]
It will take LESS THAN A SECOND to calculate this:

Timing[dd = MatrixPower[cc, 100]]
These are the values of the first row:

1., 75.0002, 2784.38, 68217.3, ...
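A rough R translation of the same construction, using the Matrix package (illustrative only; the original post's code may differ):

library(Matrix)

a <- 10000; b <- 1e-9

x_diag <- rep(1 - 1.75 * b, a)   # general diagonal rule {i_, i_} -> 1 - 1.75*b
x_diag[1] <- 1 - b               # {1, 1} -> 1 - b
x_diag[a] <- 1                   # {a, a} -> 1

x_sub <- rep(b, a - 1)           # subdiagonal {i, i - 1} -> b
x_sub[a - 1] <- 1                # {a, a - 1} -> 1

x_sup <- rep(0.75, a - 1)        # superdiagonal {i, i + 1} -> 0.75

cc <- sparseMatrix(i = c(1:a, 2:a, 1:(a - 1)),
                   j = c(1:a, 1:(a - 1), 2:a),
                   x = c(x_diag, x_sub, x_sup),
                   dims = c(a, a))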
Drew commented (2015-05-20):

Hi Everyone,

@Jake
The numbers are correct. I ran them multiple times. I lost a month+ earlier in the winter semester because my original software took so long.
I tried a couple of different methods earlier from my numerical analysis class. When I talked to my prof about the issues I had, it came down to floating-point errors. My eigenvalues all ended up being 1.00.
I got the Phi from Sabrepc.com. They have 3 Phis at a low price.
@buddy
On my gaming computer, I get 100% utilization. On the workstation and my laptop (both Intel processors), I get 50% utilization. I run simulations from BOINC, which use 100% of my processors on all my computers. The core temps on the Xeon processor and the cooling fan speeds seem to correlate with 50% utilization too. One of my fans is really loud at full throttle when I run BOINC for several hours. When I run my larger program, the fans are not that loud.
@Radford

I did download Ubuntu. The Phi drivers work on Red Hat, not Ubuntu. I spent a few days trying to get the Phi working on Ubuntu and gave up. Since I wasted so much time earlier in the semester, I didn't have a week or two more to figure out Ubuntu. As it is, I turned in my paper 2 days before the end of the semester.
@Jang

I'm not sure what you are getting at. Matrix c is a sparse matrix where c(n, n) is approximately 0.999999998. I ran multiple iterations of the program with a = 500, 1000, 2500, 5000, 10000. To cut down on programming mistakes, I used a = # to declare the size of the matrix.
@Groth

From my (poor) understanding, if you can call upon the Intel MKL and set the environment variables, the automatic offload does the rest. You can optimize the programming further. I have a copy of Intel's parallel suite, which I got free for a year. (Sometimes being a student isn't so bad.) I'm not sure if I even need it; I haven't done anything with it yet, that I know of.
Thanks for reading and commenting.

tomw commented (2015-05-20):

I have been looking into the Phi cards as well. Our simulation software is in FORTRAN. I wonder what the performance would be with FORTRAN compiled with the Intel compiler specifically made for use with the Phi cores? This is available as a bundle package 'starting for under $5k': https://software.intel.com/en-us/xeon-phi-starter-kit
Also, has anybody re-compiled the R core using the Intel starter kit, specifically optimizing for the Phi cores? I assume that is what Revolution has done to gain access to the Phi cores and the Intel math libraries?

G. Grothendieck commented (2015-05-20):

The matrix shown is a projection matrix and has the property that c raised to any power is c, so if that is the form of your actual matrix you can avoid the multiplication entirely.

jangorecki commented (2015-05-19):

Looks like you have already invested a lot of time in the configuration. I really think you should try Linux (Ubuntu can be quite good for Windows users); with just a fraction of the time you've invested already, you would have a good base to start HPC without any M$ artifacts.

Radford Neal commented (2015-05-19):

I hope you realize that you can compute the 17th power of A with five matrix multiplies (compute A^2, then A^4, then A^8, then A^16, and finally A^17).
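In R, that repeated-squaring idea looks roughly like the sketch below (matpow is just an illustrative helper name, not something from the post): it costs about log2(k) squarings plus one extra multiply per additional set bit of k, so the 17th power takes five products instead of sixteen.

# Matrix power by repeated squaring; for k = 17 this does five %*% calls
# (A^2, A^4, A^8, A^16, and finally A %*% A^16).
matpow <- function(A, k) {
  stopifnot(k >= 1)
  result <- NULL
  base <- A
  while (k > 0) {
    if (k %% 2 == 1) {
      result <- if (is.null(result)) base else result %*% base
    }
    k <- k %/% 2
    if (k > 0) base <- base %*% base
  }
  result
}

A <- matrix(runif(9), 3, 3)   # small example matrix
A17 <- matpow(A, 17)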
You might also be interested in pqR (see pqR-project.org), which can automatically parallelize various R operations. It probably wouldn't help for this calculation, however, which will be dominated by how well your BLAS does matrix multiplies.
By the way, if your cores can run two threads (via hyperthreading), there may appear to be twice as many processors, and it may appear that a program not using these extra processors is using only 50% of the CPU, but it's probably really using something like 85% of the CPU, since two threads running on the same core are far from being the equivalent of two threads on separate cores.

myschizobuddy commented (2015-05-19):

Post links to where you bought the motherboard and the Phi coprocessors for so cheap.

Jake commented (2015-05-19):

Also, I just noticed they were Markov matrices, so you're going to have a very friendly eigenspectrum.

Jake commented (2015-05-19):

Those numbers don't sound right at all. Numpy finds the product of two 5k*5k matrices in about 6.5 seconds on my MacBook Air; raising to 10**17 should take about a hundred multiplies (depending on what exactly the power is), which is about ten minutes.
In your other example, raising a 10k*10k matrix to the 100th power should take 10 multiplies, each lasting about 61 seconds, or about ten minutes.
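A quick check of those multiply counts, assuming repeated squaring (about log2 of the power in squarings, plus one extra product per additional set bit):

floor(log2(1e17))   # 56 squarings, so on the order of a hundred products in all
floor(log2(100))    # 6 squarings, so roughly 8 products for the 100th power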
Also, if your matrices are symmetric, maybe try doing a partial SVD; then you can just write down the SVD of the power matrix and multiply it back out.
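For a symmetric matrix, that decomposition route looks roughly like this in R (a sketch using eigen() rather than a partial SVD, since A = V D V' gives A^k = V D^k V' directly):

# One eigendecomposition replaces the whole chain of matrix multiplies.
set.seed(1)
M <- matrix(rnorm(16), 4, 4)
A <- (M + t(M)) / 2          # small symmetric stand-in for the 10k x 10k matrix
k <- 100

eg <- eigen(A, symmetric = TRUE)
Ak <- eg$vectors %*% diag(eg$values ^ k) %*% t(eg$vectors)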