2024 | OriginalPaper | Book chapter

12. Big Data Analytics in Bioinformatics

Written by: Ümit Demirbaga, Gagangeet Singh Aujla, Anish Jindal, Oğuzhan Kalyon

Published in: Big Data Analytics

Publisher: Springer Nature Switzerland

Abstract

This chapter examines how big data analytics and bioinformatics interact, with a focus on working with large-scale genomic data. It first covers the challenges that big data poses in bioinformatics, then surveys frameworks designed for managing extensive genomic datasets and the role of biological databases. The core of the chapter is the application of big data analytics in bioinformatics, including Hadoop, MapReduce, and deep learning methods. A detailed case study demonstrates variant detection in genomes in practice, covering the copying of data to HDFS, MapReduce-based data processing, and the multiple steps of variant calling and interpretation. The chapter is intended as a roadmap for combining current analytics techniques with the specific demands of bioinformatics.

Metadata
Title
Big Data Analytics in Bioinformatics
Written by
Ümit Demirbaga
Gagangeet Singh Aujla
Anish Jindal
Oğuzhan Kalyon
Copyright year
2024
DOI
https://doi.org/10.1007/978-3-031-55639-5_12
