Why Co-Packaged Optics Failed in 2011 and Why It Won’t This Time: IBM Europe | Jose Pozo, CTO Optica
TITLE: WHY CO-PACKAGED OPTICS FAILED IN 2011 - AND WHY IT WON’T THIS TIME : IBM Europe | Jose Pozo CTO Optica
CHANNEL: Optica Corporate Info Channel
URL: https://youtu.be/-Y6UQEGG2N4
DATE:
---TRANSCRIPT---
One, two, three. Welcome back, everybody. I hope you all enjoyed a great lunch and you’re all fed and watered, ready for this next session, where we’ll discuss, through a keynote talk followed by experts from three incredibly innovative startups, how photonics can enable the next generation of AI in both the compute and the networking layer. Please allow me to first welcome and give the floor to Professor Bert Offrein, I hope you won’t mind me saying, industry veteran and manager of co-packaged optics for IBM Research in Zurich, Switzerland. Bert, the floor and the attention of everyone is yours. Welcome. Thank you.
Thank you. Thank you, John. Okay, a veteran. Let’s see. I want to start with this picture. This is the IBM Research lab in Zurich, where I’m working. It’s the first IBM lab outside the United States, and I show this picture because this year we celebrate our 70th anniversary, just for your information. So we are very well embedded in the European research landscape, with connections to universities and companies, but of course also very strongly integrated with IBM Research and the IBM business units in the US and in Canada, out of my team for example. I am embedded in the semiconductors field, more specifically in chiplets and advanced packaging.
So, talking about veterans, I want to start my presentation with this picture. This is Roadrunner. It’s the first supercomputing system that reached one petaflop. Why do I show this picture? For several reasons. The first one, and we’ve seen this before, concerns the scaling of AI inference and training and the compute effort. Think back to the previous picture: 2008, the strongest supercomputer. Is that long ago, 2008, or not? Okay. As a veteran, I would say that’s not long ago, right? But this system would now need 10,000 days to train one of the modern neural networks. Just to put this in perspective, rather than looking at a graph alone, right?
So what is it all running on? It’s the hardware: the transistors, the logic, the circuits, and the memory, with different technologies for memory. All of this continues scaling. For the transistors, maybe 10 years ago we were saying CMOS scaling is over; now we see things continue. But what is especially important is the scaling of packaging: bringing memory closer and closer to the accelerator and the processing unit, going for example from the DIMMs that we had 10, 15 years ago to high-bandwidth memory packaged closely on the carrier substrate, and of course bringing communication close on the package as well.
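To put the Roadrunner comparison in numbers, here is a minimal back-of-envelope sketch. It takes the two figures from the talk (one petaflop, 10,000 days) and assumes, purely for illustration, that the machine runs at its full peak rate the whole time. The exaflop-class rate at the end is a hypothetical comparison point, not a figure from the talk.

```python
# Back-of-envelope: total compute implied by "one petaflop for 10,000 days".
# Assumption (not from the talk): sustained throughput equals the peak rate.

PETAFLOP_PER_S = 1e15      # Roadrunner-class rate, operations per second
SECONDS_PER_DAY = 86_400


def total_operations(rate_per_s: float, days: float) -> float:
    """Total floating-point operations performed at a constant rate."""
    return rate_per_s * days * SECONDS_PER_DAY


def days_to_train(operations: float, rate_per_s: float) -> float:
    """Days needed to execute a fixed operation budget at a constant rate."""
    return operations / (rate_per_s * SECONDS_PER_DAY)


# Training budget implied by the talk's numbers.
budget = total_operations(PETAFLOP_PER_S, 10_000)
print(f"implied training budget: {budget:.3e} operations")

# Hypothetical machine 1000x faster (1e18 operations per second).
print(f"days at 1e18 op/s: {days_to_train(budget, 1e18):.1f}")
```

Under these assumptions the implied budget is on the order of 10^23 to 10^24 operations, which is why a machine from 2008 is so far out of reach for today’s training runs.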
And IBM continues to be involved in all these developments through collaborations, also out of research, with semiconductor companies and with manufacturing and OSAT companies, in order to advance this technology, and out of research, and also out of my team in Zurich, we are directly connected to this.
So, talking about optics: we’ve been talking a bit about the real challenges that we have in optics. I think the basic challenge of optics is that it is complicated. If you just look at an electrical link, you have copper wires, maybe you have some connectors in between, but your signals start electrical, they end electrical, and everything in between is in principle relatively simple. Now look at this. If we go to an optical system, we need this tremendous amount of additional components: the lasers, the modulators, special drivers to drive these modulators. And then we have all these super complex optical interfaces that need to be clean and very accurately aligned, multiplexers and demultiplexers, as well as amplifiers. So optics has a lot of advantages, but it also comes with a tremendous amount of additional effort, additional components, and additional assembly effort. I think that is the basic challenge we need to address.
Talking about a veteran again, I was looking back through the slides I used to present. This was a slide I had back in 2011, and this was our roadmap. As you see, it was still a somewhat old-fashioned computing system with these memory DIMMs, but for the rest we had again this system from
Why do I put it here? This was the first computing system where pluggable optics was applied, this Roadrunner system. Then in 2011 IBM built a system called PERCS, which I will show you a bit more about on the next slide. It was a supercomputing system and the first system where, in the end, co-packaged optics was actually applied. We didn’t call it co-packaged optics at that time. We said first-level package optics or something like this, but it was co-packaged optics. And then we had this cool roadmap with all the things we are talking about: further integrating optics into the system and into the board, making massive connectivity. At that point we did a demonstrator in a project called Terabus, where we had polymer waveguides on the board and could do massive connectivity between two, kind of, processor packages, and then going to deeper integration, electrical and optical, also at chip level.
So that was 2011, and IBM then built this system, as I mentioned. Let’s have a closer look at what this meant. You see here this package with the processor, or in this case a switch chip, and you see there 56 sites where an optical transceiver can be assembled. This was a VCSEL-based system. Each transceiver had, I think, 12 channels, first at 10 and later at 25 Gbit per second. Now just imagine you build an electrical system. What do you do? You assemble your chip on your carrier substrate, you put your carrier substrate onto the board, and you’re done. In electronics, you just make thousands of connections in one assembly step. Now we have the optics in addition. So at once you need to build all these optical components with all the building blocks that we discussed, but you also need to assemble them on this carrier substrate 56 times. And as a next step, you need to route the fibers and make sure that every fiber or fiber cable from every element finds the right spot at the front of your chassis.
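The scale of that assembly effort can be sketched with the numbers quoted above: 56 transceiver sites, 12 channels each, at 10 and later 25 Gbit per second. The sketch below is only an illustration. It assumes one fiber per channel and simply counts every channel, without distinguishing transmit from receive, since the talk does not give that split. The class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OpticalModule:
    """Hypothetical description of a module with co-packaged transceivers."""
    transceiver_sites: int          # optical assembly steps per module
    channels_per_transceiver: int
    gbit_per_s_per_channel: float

    @property
    def total_channels(self) -> int:
        # Assumption: one fiber per channel, so this is also the fiber count.
        return self.transceiver_sites * self.channels_per_transceiver

    @property
    def aggregate_tbit_per_s(self) -> float:
        # Sum over all channels, in Tbit/s (direction split not modeled).
        return self.total_channels * self.gbit_per_s_per_channel / 1000


module_10g = OpticalModule(56, 12, 10.0)
module_25g = OpticalModule(56, 12, 25.0)

for m in (module_10g, module_25g):
    print(
        f"{m.transceiver_sites} optical attaches, "
        f"{m.total_channels} fibers to route, "
        f"{m.aggregate_tbit_per_s:.2f} Tbit/s aggregate"
    )
```

With these assumptions, one module means 56 separate optical attaches and several hundred fibers to route to the front of the chassis, compared with a single assembly step for the electrical connections.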
So it’s a huge effort. So what happened at that time, in 2011? We thought, wow, this is it, now optics will be there massively. It didn’t happen. Industry found ways around using optics. If you look at the amount of communication in the system, over time it reduced tremendously. In the beginning it was roughly one byte of communication per flop. Now it’s less than a hundredth of a byte. Right? So all this did not happen at that time.
But here we are again, and you’ve seen the chart on the lower right-hand side before. What we’ve seen over time is this tremendous scaling of compute and the lag of the communication to the memory as well as of the overall system communication. We are now in the AI era, things have changed, and in fact we see a kind of reversal. In the past IBM was building systems with multi-chip modules, where you had many processors on one carrier substrate. That was reduced because compute advanced so quickly, and now we are in the situation that we go back to multi-chip modules and need to advance the communication by bringing the optics closer. So here we are again. But what can we do now to prevent the same issue from happening again? I think that’s based on what I stated before: we need in some way to overcome the overhead that optics brings in.
So what I want to do is discuss a few of the concepts that we are working on. I’m in research, but we have a direct collaboration with an IBM OSAT business unit in Bromont, Canada. They are key for doing a lot of the assembly for the IBM mainframe systems as well as for the data center systems from IBM. They do modules, subassemblies, and full systems. On the module side, it’s co-packaged copper, direct fiber attach with V-grooves, as well as co-packaged optics with polymer waveguide attach. I will show you a little bit more about that.
So this technology is available. We use it in-house as well as offering it as a service to external companies. Roughly 20% of the whole service done at Bromont is for IBM internal use; 80% is for external clients.
Just to show you a little visualization of how we envision this polymer waveguide system: you see the carrier substrate, and then you see here this fan-out from the optical chip. What we do is attach a polymer waveguide to the optical chip through adiabatic coupling, to have low-loss connectivity between the chip and the polymer waveguides. We can do a very high-density interconnect at the chip side and then do a fan-out, as you see indicated here, in order to connect to a fiber connector, an MT. What is very critical, if we do these kinds of assembly concepts, is that we are able to integrate this whole process flow into electrical assembly. So what we’ve done is make sure that all the processes and also the stability, for example of these polymer waveguides, are solder-reflow compatible, which is what you see here.
So this is the polymer waveguide approach that we are using. On the right-hand side you see a little bit of how this adiabatic coupling concept works. You have the silicon waveguide or the silicon nitride waveguide on chip. We taper that down, and through tapering it down you force the light to transition adiabatically, as a supermode that for the wide silicon waveguide is completely in the silicon, into the polymer. And the polymer waveguide in our case is made compatible, in mode size, with the fiber. What you see on the left-hand side is a way we could, for example, extend this concept. Currently we are attaching a flex cable, so it’s a pure optical connect. But think about putting these polymer waveguides onto the carrier substrate and then doing a flip-chip attach, as you see indicated there.
You can imagine that we then have direct, simultaneous electrical and optical connectivity of the chip to the system. Right? So in one assembly step we do the electrical as well as the optical attach. We did not show that yet, but one of the things we did show is the optical connectivity. What you see here is a glass substrate with these polymer waveguides, and we have a silicon photonics chip with waveguides with these adiabatic tapers. We do a flip-chip attach of this chip onto the substrate with the polymer waveguides, and then transition from one input waveguide to another in order to visualize and see that we indeed have the coupling. What we do here is make 100 optical connections in just one flip-chip attach. Right? So that’s one of the challenges that we have in optics: while in electronics we can make many connections at the same time, in optics it’s still just a few. This is a path forward to really get to many connections simultaneously, and then potentially even simultaneously with electronics.
One step further, and this is a project that we do together with DARPA: if we think about electronics and the connectivity between chips mounted on a substrate, we have the connectivity in the silicon chip in the back end of line, the C4 attach to the substrate, and then the connectivity through the substrate, all electrical. The goal here is to build similar functionality in the optical domain, so super-high-density optical interfaces. The goal is to do 3D routing of optical waveguides, where the pitch between the waveguides is as small as 3 micrometers, with integrated turning elements for in-plane as well as out-of-plane redirecting of the light, and then connectivity with bonding of chip to chip and connectivity through the substrate, as you see indicated here. So this is a project we are currently driving.
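Two of the figures above lend themselves to a quick calculation: the 3 micrometer waveguide pitch and the 100 optical connections per flip-chip attach. The sketch below is a rough illustration only. The 250 micrometer value is a common fiber-ribbon pitch brought in for comparison and is not from the talk, and the 56 x 12 channel count reuses the 2011 module described earlier.

```python
import math


def waveguides_per_mm(pitch_um: float) -> int:
    """Whole number of waveguides that fit along 1 mm at a given pitch."""
    return int(1000 // pitch_um)


def attach_steps(connections: int, connections_per_attach: int) -> int:
    """Assembly steps needed if each attach makes a fixed number of connections."""
    return math.ceil(connections / connections_per_attach)


# Interface density: 3 um pitch (from the talk) versus an assumed 250 um
# fiber-ribbon pitch (comparison value, not from the talk).
dense_per_mm = waveguides_per_mm(3.0)
ribbon_per_mm = waveguides_per_mm(250.0)
print(f"{dense_per_mm} waveguides/mm at 3 um vs {ribbon_per_mm} at 250 um")

# Assembly steps for the 56 x 12 channels of the 2011 module.
total_channels = 56 * 12
steps_per_transceiver = attach_steps(total_channels, 12)   # one attach per 12-channel transceiver
steps_flip_chip = attach_steps(total_channels, 100)        # 100 connections per flip-chip attach
print(f"{steps_per_transceiver} attaches at 12 per step, {steps_flip_chip} at 100 per step")
```

The point of the comparison is the ratio rather than the absolute values: denser interfaces and more connections per attach both reduce the assembly overhead the talk keeps returning to.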
I still view this as something that is clearly further out, but I think the basic elements already make sense for the discussion we are having here today on optical connectivity for CPO. Here you see a few of the basic building blocks we need to make: through optical vias, vertical vias, and the turning elements in order to go from vertical to optical connectivity. These are all processes we are establishing at IBM in Yorktown Heights in the US as well as in Zurich in our clean room. What you see here is an example of these vertical waveguides that we have realized. We can measure them from the top and then measure the resonance of the light coupled in, in order to estimate what kind of losses we have, just as a first visualization of this. Similarly, these are lateral mirrors, integrated for example in a ring resonator to estimate the losses. We are currently at roughly 1 dB and need to improve this further. We already have similar losses for the vertical mirrors. So I view this kind of technology as a step, also in the shorter term, to get high-density interfaces and vertical redirecting of the light, as is also required, for example, for detachable connectors.
But overall, what I want to say is: we need more optical communication. We need tight integration of the optics, as already presented by several people. Of course, several technologies are under evaluation, and the kind of technology that will be applied for the various applications just depends on the local requirements. But there is a basic challenge, and I think it will also drive a lot of the choices that are made among these various concepts: the main aspect we need to take into account is how to handle this overhead that optics brings. I think that’s on the one hand through integration and wafer-level assembly. But I think this should not be limited to the first-level package.
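As a side note on the roughly 1 dB mirror loss mentioned above, the following minimal sketch converts a loss in dB into the fraction of optical power that survives, and adds up the losses of elements in series. The three-element path is a hypothetical example chosen for illustration; the talk only gives the approximate per-mirror figure.

```python
def db_to_power_fraction(loss_db: float) -> float:
    """Fraction of optical power remaining after a loss given in dB."""
    return 10 ** (-loss_db / 10)


def path_loss_db(element_losses_db: list[float]) -> float:
    """Losses of cascaded elements add when expressed in dB."""
    return sum(element_losses_db)


# One turning mirror at roughly 1 dB (figure from the talk).
single = db_to_power_fraction(1.0)
print(f"one mirror: {single:.1%} of the power remains")

# Hypothetical path: vertical mirror, lateral mirror, vertical mirror,
# each assumed to lose 1 dB.
path_db = path_loss_db([1.0, 1.0, 1.0])
remaining = db_to_power_fraction(path_db)
print(f"three mirrors: {path_db:.1f} dB, {remaining:.1%} of the power remains")
```

Because losses in dB add up along the path, a per-element loss that looks small in isolation becomes significant once several turning elements are chained, which is consistent with the remark that the 1 dB figure needs to improve further.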
We also need to think, as shown for example by others as well, such as Intel, about the full assembly in the system, onto the board, and enable massive connectivity without the assembly overhead that we are seeing today. With that, I would like to thank you very much for your attention.
Thank you so much. I think you know what’s coming. What can you do for others, and what can others do for you?
The kind of research that we’ve been doing has changed a bit over time. In the past we were driving a lot of technology development ourselves, in research, in our clean room, and then we licensed this technology to potential partners. We’ve done a lot of that in my team. Right now it looks a little bit different. We have come to a situation where we are at a higher level of maturity. We are doing more system design and device design, but collaborate with partners in order to deliver the technology. So I’m very open to discussing with you to see what kind of innovation we can bring into this.
That’s for sure.
Thank you. Darren. Yeah.
Hi, Darren Burns from Idex. We were having an interesting conversation at lunch about the differences between engineering challenges and materials science challenges, and how oftentimes a materials problem takes a lot longer to solve than an engineering problem. Do you see any major roadblocks associated with materials as we move through this transition? I think about polymer versus glass waveguides and the trade space between them. Is there still a set of materials science problems to deal with, or is this really in the realm of let’s just go solve some engineering problems and be done with it?
Oh, for me it’s clearly more than engineering problems, I would say, right? I mean, there are still a lot of exciting material-related aspects that need to be solved. Also in the presentations that we’ve seen before, to some extent I’m really amazed how new functionalities arise because of new material concepts or new processes that are being used. I think in the end this also goes into the engineering. If you engineer something and you want to make a reliable solution, you really have to understand your materials inside and out. So for me there’s not a direct discrepancy between engineering and materials science. They go hand in hand.
Please tell us your name and your company.
Okay. So my question has some overlap with the first question. You mentioned these polymer waveguides. What is the technology behind creating such structures? Is it based on 2PP, or grayscale lithography, or conventional lithography? The second question is: have you tested the reliability of the polymer material, for example to check whether it can pass the so-called [unclear] reliability standards, or reflow compatibility?
Mhm. So we do not realize these polymer waveguides ourselves. We purchase them. This is a quasi-commercial offering by another company.
But we did solder reflow testing, and we did the Telcordia testing, and it passed this.
Cool. Thanks.
Any further questions for Bert? If not, thank you so much. Thank you.