DS-I Africa Virtual Networking Exchange Networking Round 1 – Open Science Network
Hi everyone, my name is Wisdom Aku, one of the facilitators for this room. I take this opportunity to welcome you to this part of the networking exchange. I will be facilitating here with Sumir Panji. Sumir, I’m not sure if you want to say a word before we proceed?

No, thank you, everybody, for joining us. We’ve got a very exciting lineup of speakers, and thank you, Wisdom, for facilitating the session.

Right, so just some ground rules. We have two speakers. Generally each person would have about eight minutes, but because we have only two speakers, each will have more time. After the first speaker we go to the second speaker, and then we have general time for the Q&A session. Because of the time, I would like to encourage you to introduce yourself in the chat, and then we will go ahead and start with our first speaker. I just want to know if we have Tshiamo Motshegwa here, coming from the African Open Science Platform? Thank you very much. You have the floor.

Okay, thank you very much. Let me see if I can project here for you. Can you see my screen?

Yes, just put it on presentation mode.

Okay, can you see it now?

Great, yes, proceed.

Okay, great. I’m very glad that you are generous with time. I was conscious that with 8 minutes, in as much as that is sufficient for introductions, we may lose some important information. So really I’m excited to be here, coming in to give a presentation on efforts in the continent to build a truly pan-African platform to connect us, and to engage in and advance open science in the continent across its tenets. I also wear another hat, because I engage in the DS-I Africa initiative through the eLwazi project. I sit on the science advisory board for that, and I think this type of networking session is an ample opportunity to connect the networks. So, quickly, in terms of the African Open Science Platform, you’ll see I have
listed a number of things there. All of those are very pertinent to what we are trying to do in the continent. We’re trying to make sure that we can influence the development of the requisite policies that overarch everything in the science system. As you can imagine, the issue of infrastructure is very critical to Africa. There are a lot of African-trained researchers, fully qualified scientists who have been trained worldwide, but they cannot do their science in the continent because they cannot get the requisite access to infrastructure. We all know the issue of data is very critical regarding the open agenda, and how we can purposefully share data to make sure that we can do more science. The issue of skills is critical to Africa as a continent. We know that Africa is very young in terms of average age, and we want to influence the development of the requisite skills for doing science in the digital era in the continent. We all know that open science, in as much as it’s not a new phenomenon, is also a new phenomenon in terms of how we do things, so incentives to change our approach, our entrenched way of doing things, in the new dispensation are quite important. And of course the last two elements there, collaborations and partnerships: we need these in the continent to be able to make inroads. We were asked today to give inputs on those elements, and I’ll try in my presentation to do exactly that.

I don’t want to dwell on this too much; all of you, I believe, know what open science is. But I think we need to emphasize that science must be science for transformation, and it must also impact society. So the open science agenda is really premised on that: at the very core we are putting society at the center of all that we do. Of course there are various elements of it regarding open access to scientific literature, not just for peers but also communicating results to society; I think that is critical. And then of course there is the elephant in the room regarding the data that, as you all know, emanates
including from publicly sponsored research: what do we do with it?

I think the value proposition and the case for open science are well understood. I’m putting that slide there for the benefit of some who may be new to this thinking, but you can see that it’s really not an alien or new thought. Good scientific practice depends on communicating evidence. If you do good science, it should be natural to you to share the evidence of your research together with your publications. And it’s very clear that this way of working has really transformed a lot of areas: genomics, astronomy, and environmental science have traditionally been very open with the data sets that they have. And of course the value proposition is also just that if we do this, we can accelerate scientific discovery. A lot of good work has been done in terms of formalizing and studying this value proposition; I’m sharing that again for your information.

The UNESCO recommendations have been approved by all or most of our countries, including in the continent. There’s a set of recommendations that have implications for everyone at various levels: at the national level, at the institutional level, and at an individual level as a researcher. I think it is very important to introspect and see, going forward, how much we are doing in terms of those action items. Indeed, there’s a monitoring and evaluation framework in place to see how well all of us are doing in terms of implementing aspects of those recommendations. The first UNESCO Open Science Outlook was launched in Geneva; we were there in December. That was the first iteration, but there will be a sequence of reviews every four years on how countries, organizations, and everybody else are doing regarding this. So really I’m sharing this for information, so that participants here can be abreast of what’s happening, and bringing it home, because I
think this is critical. This has to be domesticated in the continent, in our organizations, in our countries, to say what open science means and how we, as individual countries and individual organizations, implement it vis-à-vis our mandates.

A very good study was done, culminating in a report in 2019, to really see what’s happening in Africa in terms of the landscape across the various dimensions. The link to the detailed report is shared there. There were very key things uncovered during that study, including the barriers to actually doing open science in the continent. I think this is very critical, because a baseline study informs future actions; indeed, you’ll see in this presentation what it ultimately culminated in. The glaring gaps are exactly in some of those areas that you see. We know that investment in research in Africa is limited, but there’s no shortage of people willing to work with us, including for example through DS-I Africa, NIH, and others, to advance elements of science in the continent.

It brought us, as a community in the continent, to develop a strategy for how we can change this dispensation. A detailed strategy that was developed through a consultative process is available online at the AOSP website. It really shows a visionary, progressive, future-looking strategy with pillars that we can start implementing, especially leveraging the capacities that we have, and it’s critical that we are purposeful and pragmatic in advancing those elements that you see there. I mentioned the issue of infrastructure. I think critical also is the issue of having some sort of facility, virtual or otherwise, to galvanize us around data and data science, which is why I think DS-I Africa is doing great work in that particular dimension. I think the issue of having projects that will allow us to tease out the interfaces between us and advance open science and the open agenda is critical, and those projects have to be
founded or based on the needs and the challenges in the continent across different spheres, including health, agriculture, and the like. So it’s very critical that we find these projects, work on them together, and have impact. I also want to mention the element of developing the necessary networks. The platform that you are using today to connect us is very critical; it’s a good network platform. So this issue of engaging a network for dialogue around open science is critical, and of course a similar network for making sure that we advance training and skills.

So really, in summary, we want this platform to convene us and coordinate the many fragmented activities and interests in the continent, to make sure that we can pool the resources, pool the expertise, amplify impact, and do things together. More often than not things are fragmented, and some things that are very good do not percolate to other spheres of the continent. I summarize this in this particular slide. Often very good work is done in the continent regarding research, but we are not showcasing it enough, and there’s a misconception that not much science can happen in the continent, but indeed it does. We have others who have invested in some capacities; let’s share those capacities. We have people doing things all over the continent; let’s pool those activities and resources and amplify impact. So really I think that’s a no-brainer.

I think the value proposition that we developed is compelling, and I wanted to share it with you so that, as per the intentions of this networking platform, you can appreciate what AOSP is, what our goal is, who our stakeholders are, and what we are offering to those stakeholders. I’ll share this with you so that you can keep it. And of course there are the benefits of partnering with AOSP: why would you want to partner with AOSP given what you are already doing? I think it’s quite
compelling, and nobody can really argue with these particular benefits. If you’re a government and you are developing science policies, you may find it very useful to work with AOSP and AOSP partners in helping you develop your open science policy frameworks and work out how best to implement them. If you are an institution and you want to attract international collaborators, the AOSP network is growing and it’s connecting other networks; you’ll benefit from engaging us. If you are a researcher and you want to upskill and get trained on elements of research data management and other things, you may find it very progressive to engage with AOSP. If you are a funder or a donor, you want the dollar that you invest in these research calls to go an extra mile; you want to have impact in terms of what you do, so working with AOSP can really advance your agenda. We also have details on how AOSP delivers on this mandate and how you can work with us in partnership, because in the end we’ll have to work together to implement these areas, and of course sustainability is key. So really I think this value proposition communicates vividly what we are about.

In terms of the implementation of AOSP: AOSP has a coordinating hub that is based at the NRF in South Africa. AOSP is hosted in South Africa, by virtue of being hosted by the government of South Africa for Africa, for the duration of the period. The coordinating hub is based at the NRF. The NRF is a national research facility that also hosts very large national research infrastructures, and I think this provides very strategic inputs to the coordinating hub, enabling it to leverage them and also provide linkages in the continent. Then, in terms of the continental operationalization, AOSP has a footprint, and intends to have a full footprint, in the continent. We are fortunate at this stage to have three regional nodes. Some of the nodes are represented here in the networking session; it would be good for you to
chat with some of them. In Southern Africa we’ve got the UbuntuNet Alliance, which is a network of NRENs in the region. They have a very large pool of stakeholders through universities and have been providing services over the years, and going forward they will also be helping us expand on the issue of open science. In East Africa we are very excited to have the African Institute for Capacity Development, based at Jomo Kenyatta University of Agriculture and Technology. AICAD is a very good institute that has a mandate in East Africa, and they’ll be working with us to advance open science there. In North Africa we have the National Authority for Remote Sensing; Ria is here also. They will also coordinate there. We’re looking forward to having future calls where we can have West Africa and Central Africa represented. Africa is a large, diverse continent; nobody can pretend for a second that we can do what we want to do without this representation that you see there. And of course for every region there’s alignment with science policy. It’s very important for open science to talk to existing policies, existing programs, and existing continental and regional agendas. So really that’s the operational modality.

If we can get this right, Africa can be represented on the global stage through engagement in the global open science platforms, because we’ll be organized and we can plug in. Of course we can plug in only if our structures are also covered. I think there is a blueprint of what is required to be a reasonably efficient open science platform or cloud, for you to be able to get your hooks into the global platform and engage in those collaborative projects that require all of us. If you look at climate change, biodiversity, or epidemiology, these are global challenges that require us to be organized. So indeed, by organizing ourselves this way as Africa, we can. The issue of data, as we mentioned, is very critical. You’ll recall that one of the tenets is to make
data FAIR. An influential paper was published 10 years ago, and just two months ago, in January, we met to discuss and reflect on the past 10 years: what has been done and what needs to be done to make data FAIR. AOSP engages in platforms to represent African voices, and I think it’s critical for us, as the continent, to also have a position on some of these things, and indeed for communities of practice like health to think about how you can work towards making your data FAIR and turning that into reality. So AOSP provides linkages into these platforms to assist our stakeholders.

We also influence policymaking, as I mentioned, because there’s a very big gap in terms of policy in the continent. Some countries have started, but you can see that by and large there’s a lot of work to be done. So guidance is required, which is why we engage regional member states and regional blocs, to try to assist them in positioning open science as a key initiative that they also need to push, in parallel with the already existing science and technology policies. Indeed, this is very welcome; some progress has been made in some regions to bring open science into the mainstream.

But colleagues, what is critical is the need for projects, which is why I think a project like DS-I Africa matters; we are grateful for the NIH funding to advance this area of data science. What we need is more of these projects, and when we do have these projects, they must have a continental footprint, or there must be an effort to make them continental. There’s a lot of work happening in the continent that often is not showcased and often is not acknowledged. So it’s very important, I think, that as a continent we organize ourselves so that those that want to engage with us find that we are organized and reachable, and when there is a call in a particular area we know who to talk to, not just our small clique of friends that we’ve been working with for the past 20 years, but we try very hard to engage
on a continental scale. Colleagues, there are a lot of opportunities in terms of the advancement of cross-disciplinary areas, because some of the problems we are trying to solve are cross-disciplinary.

Finally, very quickly, I want to show you some of the recent proposals where AOSP reached out to the networks to try to build stronger consortia, some of which you may be aware of. You know about the eLwazi project that has been mentioned. There was also a recent call regarding the EU-Africa partnerships on e-research infrastructures, things like biobanks: where are the biobanks in Africa, in terms of data for that? There are some proposals that we engaged in across those spheres.

On the issue of funding, colleagues, I think as a continent we cannot always be... this is the last slide, all right. We need to organize ourselves continentally in terms of funding for research as well. Indeed, as AOSP we are trying to plug into those research funding initiatives that involve African funders.

Finally, the issue of conferences. We need to have continental conferences to help us dialogue, and meetings like this. I’m glad to say that in 2027 Africa will be hosting IDW 27, International Data Week. Prior to that, in 2018, it was hosted here in Africa as well, in Botswana. We just came from Visa Africa in Kenya, hosted by the East African node, again advancing this dialogue. There’s an upcoming conference in December, in collaboration with the NRF and the French IRD, the International Conference on Open Science Clouds, and hopefully going forward, colleagues, we will have more and more of these.

Finally, if you want to engage with us, the burden is light: we need your participation across those spheres. If you are a service provider regarding computational facilities or software tools, or you have expertise in developing policy or in research data management, reach out to us. We are building a data science and AI institute, we want those projects I talked about, and we are creating networks. So thank you very much, chair, for your indulgence and for giving me so much latitude
in making this presentation. Thank you.

Thank you, that was good. Thank you very much for the presentation and your advocacy on open science. As I indicated, we have two speakers, and we will go ahead with the second speaker. While that is going on, I would encourage you, if you have any questions, to post them in the chat box. Our next speaker is Fay Booker, I hope I got the name right, who is from the Center for Translational Data Science, Gen3. Please go ahead.

Good morning, everybody. Are you seeing a screen that’s showing a presentation? Because it doesn’t look like it is on my side. Never mind, I will stop that share.

Just put that on presentation mode.

Okay, there we go.

Excellent.

All right, good morning everybody. My name is Fay Booker. I’m a senior scientific support analyst at the Center for Translational Data Science, and today I’m going to talk about our open-source software platform for data sharing and analysis. We call it Gen3. It’s not a particularly good name, but we will work with it. I’m going to tell you a little bit about Gen3, talk a little bit about what a data commons is and what our Gen3 capabilities are, and then show you a demonstration of both a data commons and a data mesh, which is the ability to move over a number of data commons and bring all of that information together in one place.

Gen3 is built and maintained by the Center for Translational Data Science. We’re housed at the University of Chicago. We develop and operate data platforms to support research on topics of societal interest. If you’ve never been here, well, we’re in Chicago, which is sort of in the middle of the country. This is a nice picture of the university, and this is our staff as of a year ago, those who happened to be in Chicago on a day when we all got together.

Gen3 is built around the concept of a data commons, and a data commons is an infrastructure that co-locates data storage, management, and computing
infrastructure with tools used for analyzing and sharing data. So we are curating a FAIR data environment, in just the way that the previous speaker was just discussing. We are open source; our entire source code sits on GitHub. We provide core microservices that do things such as user authentication and authorization: who are you, and what do you have access to? We can store and manage metadata, which is large, both structured and unstructured. Unstructured metadata is information about how a study was collected; structured metadata may be actual clinical data about subjects and cases. We also enable the ability to upload and download files, as well as move files from one location into a cloud computing resource. We mint permanent digital IDs called GUIDs, each of which uniquely identifies something over the course of its life cycle. We have tools that allow you to query metadata and file data using GraphQL, and we have a web-based user interface for data management, exploration, and analysis.

As I said, Gen3 is not a particularly unique name. If you Google Gen3, we are probably number five on the list, but this is the third generation of this technology. We use Docker containers, and we also utilize cloud automation in the form of Kubernetes. User data access and compute resource authorization can be tailored to individuals, and as I mentioned before, structured metadata and files are assigned a unique permanent ID so that everything is easily identifiable.

I’m going to break out of presentation mode in just a second and show you some resources that are powered by Gen3, and then a quick demo of the Medical Imaging and Data Resource Center, MIDRC, and also the Biomedical Research Hub, which is a data mesh. You should now be seeing, I hope, a window that says “Powered by Gen3”.

Yes, yes.

All right, I did it correctly for a change. So, we are largely based in
the United States, and these are a number of repositories that we support. Many of you who do genomic research are probably familiar with AnVIL; you may also be familiar with the Cancer Research Data Commons, CRDC. We also support BioData Catalyst, goml, a major project on opioid addiction called the HEAL Initiative, as well as JCOIN, which is hopefully somewhere else on the page. But we also have some international collaborations. Right now we’re particularly active with the Australian Cardiovascular Data Commons, based out of Melbourne, and Genomics Aotearoa, which is based in New Zealand. We are also part of a large consortium here in the US studying pediatric cancer. If you go to this particular page, you can click on any of these repositories and go visit them.

All right, I want to show you the basic setup of any research data commons. A data commons brings together data from one or more locations, harmonizes it to a data dictionary, and makes the data searchable. It’s particularly useful for identifying cohorts and determining whether there are adequate sample sizes. Every Gen3 data commons has a dictionary, which is not always the easiest thing to view. You can view it as a graph, which I’m sure everybody is thrilled about. You cannot see that particularly well, but you can click through the various nodes, see where each exists in the data model structure, and open its properties to see what is required and what is not. It’s a good way of getting an overview of what kinds of things are available. That underlies everything, and we harmonize data to these table properties.

But the most interesting thing, in fact, is the ability to search across studies. MIDRC is an imaging repository, so it has a variety of image formats, from CAT scans to digital to MRIs, etc. We have a number of imaging studies. You can pick the modality, so what kind of images
are they, the kinds of descriptions, particular ages of the patients at the time, what body part was examined in the image, and so on and so forth. As you pick these things, you see the imaging data change. So I could say it was a CAT scan, and I’m down to 4,000 imaging studies from 300. I can say it was a chest with contrast, and it continues to reduce. For imaging studies we also have a built-in ability to view those studies in a DICOM viewer. So if we click on one of these and let the internet do its fun stuff, which hopefully won’t take too long, eventually we should see an image pop up. But I don’t want to belabor this, so that part of the presentation isn’t as sexy as it... oh, here it comes. These are the images that were supplied. We have worked with the submitters to strip any demographic data off of the images, so things like names, gender, sex, and age are typically not available on the images themselves, but they are available in metadata. You can just scroll through those images as you desire. It’s a little slow on my end, so I’m going to go ahead and back out of this.

Once you’ve selected a cohort, you can download a manifest, which lists the files that you have selected, and you can use Gen3 tools to download the data of interest. Or, if you have a workspace identified, you can use Gen3 tools to transfer the data into that workspace. We do require you to log in for downloads, and there’s a login button over here. I’m not going to show those examples, but that’s how we control access. Everything in this repository happens to be open, but we still like to track who has downloaded things. So that’s the overview.

We also provide additional context. For people that are new to MIDRC, we provide some example analyses. These are all Jupyter notebooks that show you how to build cohorts, select cases, and do some basic analysis. So we provide you
with examples to help you get up and running a little more quickly. So that’s an example of a data commons.

We also have an overarching data mesh, and we have several of them. I will show you the Biomedical Research Hub; let me back out of what I was doing here. There we go. BRH, as we call it locally, brings together 567 studies, I believe, from 10 different data commons, and you can identify which data commons they are by clicking on this Data Commons button and then scrolling down to see what’s available.

Can I remind you that your time may be up?

I will finish up in two minutes.

All right, thanks.

Hopefully less. This is just another way of looking at the data. This is more like a data catalog, and you can search for terms like cardiovascular or something else. There are little buttons here that tell you whether the data is available: a locked lock means you don’t have access to the data, but there are open ones. You could then select a data set, and if I had access to it, I’d be able to select it and pop it into a workspace. We’re working on strategies right now to build and bring your own Nextflow workflows into workspaces, so this is an exciting time for us.

Let me just pop back to my presentation. You can learn more about us at gen3.org and at the Center for Translational Data Science at the University of Chicago. We host community events about every month, and we have our own YouTube channel with videos from those events that talk about new things happening in Gen3. Our source code is on GitHub, and as always you can see who’s using Gen3 by visiting our Powered by Gen3 page. And that is all I have.

Thanks very much, Fay, and a big thank you to both of our presenters. At this moment I will invite some questions and comments from the audience. If you have any questions or comments, please put up your hand and
then I can acknowledge you and you can make your comment. Otherwise you’re also welcome to post your questions in the chat box. Sumir, do you have any comments?

No, no comments from me, thank you. I’m just watching. Maybe I can ask Fay a question.

All right, go ahead, Sumir.

Fay, as part of eLwazi ODSP we also use the Gen3 technologies that you guys create. I’m just wondering, is Gen3 used anywhere else outside of the US, in any other countries, like in Europe or in South America?

I think I mentioned this: we have partners. It’s been used in Australia and New Zealand. We also have a project starting in Taiwan, and we have had some research collaborations in Singapore. So it’s out there. It’s not as broadly used outside of the US as we would like. We are starting, as you just mentioned, some work with eLwazi; we’re working to set up a Gen3 commons with them. So we are becoming a little bit better known. I think one of the advantages is that we can do both data storage and access control; we can do a lot of the back-end operations that groups like Terra or Seven Bridges, some of the commonly used cloud resources, don’t have. On the other hand, they do some things better than we do; they have better workflows.

Great. I think there’s a question as to whether you have training data for diabetes mellitus.

Oh, that’s a good question. There might be some on BioData Catalyst, BDCat. I know that Gino Mel used to have some diabetes data sets, but I am not sure what its current status is. I’m happy to take a look and see if I can find anything. It would be US-based data, though, so I’m not sure if that would be as useful. And then somebody is following up about data sets for lymphatic filariasis in endemic regions. That I don’t think we have. But here again I would definitely encourage you to look at our
repositories and see what is available.

All right, thanks. Any other questions? Let me just check if there are hands. I don’t see any hands. Tshiamo, would you like to say something extra?

Yes, indeed. Just appreciating the presentation, colleagues. Indeed, you can see that there are some capacities elsewhere in the continent, in terms of partnerships and relationships to be made, and as I emphasized in my slides, it is very important to amplify the impact of these relationships that, maybe individually, we have with some competent partners. For example, the issue of training, I think, is a critical one in the continent. If you look at some of the possibilities regarding those that do have expertise, including the presentation that has been made, I’m sure there could be low-hanging fruit regarding some generic training around the issues that their technology, for example, looks at.

And I like the questions that I see in the chat: do you have data sets for this, do you have data sets for that? I think this goes back to the presentation that we made: where is the data in Africa about the things that we want to study? Is it the case that it’s there but we don’t have access to it, or it’s not organized properly? Or is it the case that we are not collecting the data we should be collecting? What are the issues? I’m glad these questions are coming up in this chat; I think it gets us to think. And if we do have that data, where do we host it in terms of the infrastructure, and how do we motivate resourcing for this infrastructure in the continent, where we can host these things? Remember, with this data, if it is properly curated and clean, we can do interesting things. Everybody’s talking about artificial intelligence and inferring new knowledge from the data that we have. Until we get these fundamentals right, we will be disadvantaged in terms of exploiting what we have. So
really, I’m very glad that the issue of data is coming up, and that the issue of the infrastructure and how that infrastructure would look is coming up. I think these are the million-dollar questions that we can start addressing in incremental steps, and as AOSP we are very keen to work across the board with stakeholders to try to make these incremental steps. Thank you.

Right, thanks very much. I would like to thank all of you for coming through. Our time is actually up and we need to go to the next session. You can contact any of our speakers via the contacts that they shared, and I look forward to meeting you in the other rooms. Thank you all very much.