Microsoft launches Models as a Service in Azure AI, GPT-4.5 leaks, Mistral La Plateforme and Mixtral 8x7B, and more AI news to close the 2023 cycle.

    — sources

    MS MaaS: https://techcommunity.microsoft.com/t5/ai-machine-learning-blog/announcing-llama-2-inference-apis-and-hosted-fine-tuning-through/ba-p/3979227
    MaaS video: https://www.youtube.com/watch?v=GS5ZIiNqcEY

    Microsoft going nuclear: https://www.windowscentral.com/microsoft/microsoft-is-reportedly-eyeing-nuclear-energy-for-its-ai-ventures-following-the-techs-exorbitant-power-consumption
    nuclear job post: https://jobs.careers.microsoft.com/global/en/job/1627555/Principal-Program-Manager-Nuclear-Technology

    Intel unveils new AI chip: https://www.intel.com/content/www/us/en/newsroom/news/ai-everywhere-core-ultra-5th-gen-xeon-news.html
    CNBC reportage: https://www.cnbc.com/2023/12/14/intel-unveils-gaudi3-ai-chip-to-compete-with-nvidia-and-amd.html

    Mistral La Plateforme: https://mistral.ai/news/la-plateforme/
    Mixtral-8x7B: https://mistral.ai/news/mixtral-of-experts/

    Direct Preference Optimization: https://arxiv.org/abs/2305.18290
    Dolphin-2.6-mixtral-8x7b: https://huggingface.co/cognitivecomputations/dolphin-2.6-mixtral-8x7b
    4-bit: Mixtral-8x7B-v0.1-GPTQ

    SauerkrautLM-Mixtral-8x7B: https://huggingface.co/VAGOsolutions/SauerkrautLM-Mixtral-8x7B
    notux-8x7b-v1: https://huggingface.co/argilla/notux-8x7b-v1
    DeciLM-7B: https://huggingface.co/Deci/DeciLM-7B

    edge-SAM: https://github.com/chongzhou96/EdgeSAM

    GPT-4.5 leaks: https://www.reddit.com/r/OpenAI/comments/18i5n29/anyone_hear_of_gpt45_drop_today/
    daniel_nguyenx’s post: https://twitter.com/daniel_nguyenx/status/1735260556892967170
    BGR: https://bgr.com/tech/chatgpts-gpt-4-5-update-might-have-just-leaked-heres-what-we-know/
    “nah”: https://twitter.com/sama/status/1735422206296088950

    LayoutGPT: https://layoutgpt.github.io
    Compositional Visual Planning and Generation with Large Language Models: https://arxiv.org/abs/2305.15393

    Bytedance’s StemGen: https://julian-parker.github.io/stemgen/#overview
    paper: https://arxiv.org/abs/2312.08723

    llm360.ai: https://www.llm360.ai/index.html
    blog: https://www.llm360.ai/blog/introducing-llm360-fully-transparent-open-source-llms.html

    DeepMind FunSearch: https://deepmind.google/discover/blog/funsearch-making-new-discoveries-in-mathematical-sciences-using-large-language-models/
    Math discoveries: https://www.nature.com/articles/s41586-023-06924-6

    Deep spatial-temporal resolution from coarse data: https://phys.org/news/2023-12-deep-spatial-temporal-resolution-coarse.html
    paper: https://agupubs.onlinelibrary.wiley.com/doi/10.1029/2023EA002906

    RWKV: https://www.rwkv.com
    wiki: https://wiki.rwkv.com
    RWKV-LM: https://github.com/BlinkDL/RWKV-LM
    ChatRWKV: https://github.com/BlinkDL/ChatRWKV
    paper: https://arxiv.org/pdf/2305.13048.pdf

    Math perspective on Transformers: https://arxiv.org/pdf/2312.10794.pdf
    github: https://github.com/borjanG/2023-transformers-rotf

    Putin vs AI double: https://www.reuters.com/world/europe/putin-confronts-his-ai-double-2023-12-14/
    BBC reportage: https://www.bbc.co.uk/news/world-europe-67711802

    Hi guys, it’s Simon C. Welcome to this new AI Radar. As you might have noticed, in the last Radar I started doing things a little differently: I talked about Gemini Pro and Microsoft’s Phi-2 model. Today I’m going to record a longer video with more AI news, covered by a mix of slideshows, articles and videos, just to see how things go.

    At its Ignite event, Microsoft announced the upcoming preview of Models as a Service in Azure AI Studio (which is in preview itself). In Azure you have been able to deploy models onto your own infrastructure for a long time: you simply go into the model catalog, select the model, deploy a virtual machine to run it on, and you’re off to the races. But not every customer wants to think about operating infrastructure, which is why at Ignite Microsoft introduced Models as a Service, which operates models as API endpoints that you simply call, much the way you might call the Azure OpenAI Service. With Models as a Service you can build generative AI and large language model applications with tools like Prompt Flow, Semantic Kernel and LangChain, with minimal setup; no management of GPU infrastructure and hosting is required.

    Paraphrasing Microsoft’s own video: today we are taking a major step in making some of those models accessible to every organization; we are happy to announce Models as a Service, enabling deployment of open models as ready-to-use APIs. Customers subscribe to different tiers depending on the capacity they plan to consume, and the new deployment doesn’t require provisioning GPUs. Building applications on top of these models requires more sophisticated tooling, which is why they are integrated with your favorite tools. Okay, that was a video about Models as a Service by Microsoft itself.

    The catalog in Azure AI includes flavors of Meta’s Llama 2, for text and chat completion; Meta’s Llama 2 is cheaper than GPT-3.5 and GPT-4. The catalog also includes OpenAI’s GPT-4 Turbo with Vision.

    It will soon also include Microsoft’s Phi-2, a 2.7-billion-parameter small language model for laptops and smartphones, and Orca 2, in 7- and 13-billion-parameter versions, also from Microsoft, which features strong reasoning abilities.

    Microsoft has also decided to go nuclear: it has partnered with Helion to power its data centers for demanding generative AI workloads with nuclear energy. It all started from a Microsoft job post looking for a Principal Program Manager who will be responsible for maturing and implementing an energy strategy. Microsoft has also partnered with Terra Praxis to train large language models on nuclear documentation and regulation, to speed up the nuclear approval process.

    Intel, meanwhile, unveiled new AI chips to compete with Nvidia and AMD. Intel’s new AI chips target both PCs and data centers.

    For PCs, Intel is launching Intel Core Ultra processors, whose NPUs power an initial line of Dell, Microsoft and Lenovo computers that are already selling in the US. For data centers, Intel launched its fifth-generation Xeon chip, which has AI acceleration built into each core, and it also demoed Gaudi3, its AI accelerator for generative AI, which is supposed to be a competitor to Nvidia’s H100. Intel claims its new processing unit for laptops delivers 2.5x better power efficiency than the previous generation, but it will be interesting to see how it fares against other XPUs. The fifth generation of the Xeon line has built-in acceleration and up to 42% higher inference performance, with under-100-millisecond latency on large language models of fewer than 20 billion parameters. Intel says that customers will employ a mix of AI solutions, as Zoom did.

    Okay, so Mistral launched its new La Plateforme in beta.

    It brings Mistral’s strongest open generative models to developers, along with efficient ways to deploy and customize them for production. Mistral provides three text-to-text endpoints — mistral-tiny, mistral-small and mistral-medium — and one embedding endpoint, mistral-embed, and it provides Python and JavaScript client libraries for querying these endpoints. mistral-tiny is the most cost-effective endpoint: it serves Mistral 7B Instruct v0.2, which works only in English, and it is the cheapest model. mistral-small serves Mistral’s newest model, Mixtral 8x7B, which I’m going to talk about later. mistral-medium, the highest-quality endpoint, serves a prototype model that masters English, French, Italian, German, Spanish and code. mistral-embed, on the other hand, provides 1024-dimensional embeddings and has been designed with retrieval capabilities in mind.
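For a concrete sense of how you would call these endpoints, here is a minimal sketch that just assembles the JSON body for a chat-completion request. The endpoint path and field names follow Mistral's public REST docs as I understand them; treat the exact names as assumptions and check the official client libraries before relying on them.

```python
import json

def build_chat_request(model: str, prompt: str, max_tokens: int = 64) -> dict:
    """Assemble the JSON body for a chat-completion call to La Plateforme.

    The field layout mirrors Mistral's documented REST API
    (POST https://api.mistral.ai/v1/chat/completions); verify against
    the current docs before relying on it.
    """
    return {
        "model": model,  # "mistral-tiny", "mistral-small" or "mistral-medium"
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

body = build_chat_request("mistral-tiny", "Name three French rivers.")
print(json.dumps(body, indent=2))
```

You would POST this body with an `Authorization: Bearer <API key>` header; the official Python client wraps exactly this kind of call.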

    Now I’m going to talk about Mixtral 8x7B. It is a sparse mixture-of-experts model with open weights, pretrained on open web data. It is a decoder-only model in which a feedforward block picks from eight distinct groups of parameters, and this allows it to increase the number of parameters while controlling cost and latency. It has a permissive license, Apache 2.0, and you can access it through the mistral-small endpoint on La Plateforme, as we have come to see. Mixtral has the following capabilities: it gracefully handles a context of 32k tokens; it handles English, French, Italian, German and Spanish; it shows strong performance in code generation; and it can be fine-tuned into an instruction-following model that achieves a score of 8.30 on MT-Bench. Mixtral 8x7B matches or outperforms GPT-3.5 on most standard benchmarks, and it often outperforms Llama 2 70B while having six-times-faster inference. Mixtral is additionally claimed to be less biased than Llama 2 on the BBQ benchmark.

    Mistral also serves Mixtral 8x7B Instruct, optimized for careful instruction following by means of supervised fine-tuning and Direct Preference Optimization (DPO). DPO is discussed in the arXiv paper “Direct Preference Optimization: Your Language Model Is Secretly a Reward Model”: it is a new parameterization of the reward model in reinforcement learning from human feedback (RLHF) that enables extraction of the corresponding optimal policy in closed form, solving the standard RLHF problem with only a simple classification loss.
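The DPO idea fits in a few lines: instead of training a separate reward model and running RL, you push the policy's log-probability margin between the preferred and rejected completion, relative to a reference model, through a logistic loss. A toy scalar sketch (the log-probabilities are given as plain numbers here; in practice they come from the models):

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Direct Preference Optimization loss for one preference pair.

    -log sigmoid(beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l)))
    where w = chosen (preferred) and l = rejected completion.
    """
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))  # -log(sigmoid(margin))

# The loss shrinks as the policy prefers the chosen answer more strongly
# than the reference model does:
weak = dpo_loss(-5.0, -5.0, -5.0, -5.0)    # no preference learned yet
strong = dpo_loss(-3.0, -8.0, -5.0, -5.0)  # chosen boosted, rejected pushed down
print(weak, strong)
```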

    Here are some Mixtral 8x7B variants. First, its uncensored variant, dolphin-2.6-mixtral-8x7b: it is a fine-tune with 16k context, and it is great for coding. Dolphin models are sponsored by Convai; this one is really good at coding, trained with a lot of coding data, and very obedient, but it is not DPO-tuned like Mixtral 8x7B Instruct, so you still might need to cage it with the system prompt, as the author shows in the example. Then you have a 4-bit implementation, Mixtral-8x7B-v0.1-GPTQ, by Hugging Face user TheBloke: 4-bit, with act-order and no group size, to lower VRAM requirements.
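To see why a 4-bit build lowers VRAM requirements, here is a deliberately naive round-to-nearest 4-bit quantizer. Real GPTQ is smarter — it minimizes layer output error using second-order information, with group sizes and act-order — but the storage idea, 16 integer levels plus a scale factor, is the same:

```python
def quantize_4bit(weights):
    """Naive symmetric round-to-nearest 4-bit quantization.

    Real GPTQ minimizes layer output error using Hessian information;
    this only illustrates the storage format: small integers plus a scale,
    so each weight needs 4 bits instead of 16.
    """
    levels = 2 ** 4                    # 16 representable values
    w_max = max(abs(w) for w in weights)
    scale = w_max / (levels / 2 - 1)   # map weights to integers in [-7, 7]
    q = [max(-7, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]

w = [0.31, -0.12, 0.07, -0.29]
q, s = quantize_4bit(w)
w_hat = dequantize(q, s)
err = max(abs(a - b) for a, b in zip(w, w_hat))
print(q, round(err, 3))
```

The reconstruction error stays below one quantization step; GPTQ's whole contribution is distributing that error so the layer's outputs barely change.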

    Then you have SauerkrautLM-Mixtral-8x7B, a variant trained on a mix of German data augmentation and translated data, which handles English, German, French, Italian and Spanish. Then you have notux-8x7b-v1, fine-tuned on a high-quality chat dataset using Direct Preference Optimization. It is a fine-tune of Mixtral 8x7B Instruct, and it outperforms the other mixture-of-experts-based models on the Hugging Face Hub. It is part of the Notus family of models and experiments, in which the Argilla team investigates data-first and preference-tuning methods like DPO; this model is the result of their first experiment, tuning a mixture-of-experts model that has already been fine-tuned with DPO. Cool.

    Then you have dolphin-2.6-phi-2, also sponsored by Convai and based on Microsoft’s Phi-2 model. It comes with a non-commercial-use license, and it is uncensored, so you have to implement the alignment layer by yourself before exposing the model as a service. Then you have a Phi-2 coder fine-tune, which has been fine-tuned on CodeAlpaca-20k, an instruction-following dataset used to fine-tune the Code Alpaca model; it was fine-tuned using QLoRA with the PEFT library, and it is good for code generation.

    And here you have DeciLM-7B. It is a decoder-only, pretrained, autoregressive text generation language model of approximately 7 billion parameters. Deci claims that it is not only the most accurate 7-billion-parameter base model, but that it also outpaces all models in its class, with a throughput up to 4.4 times that of Mistral 7B. This efficient model uses variable grouped-query attention to achieve a superior balance between accuracy and computational efficiency, and its architecture was generated using Deci’s proprietary neural architecture search technology, AutoNAC. It supports an 8K-token sequence length, and it is released under the Apache 2.0 license. And that’s it.

    Now let’s switch to computer vision with EdgeSAM, an accelerated Segment Anything Model. It is up to 40 times faster than SAM; it has the same encoder-decoder architecture as SAM but has been optimized for edge devices, and it can be 14 times faster than MobileSAM on edge devices. Let’s check a demo: through a single tap it can segment out an object for you, and you can hold and drag the object to copy it to other apps such as Notes; the demo runs on an iPhone 13. EdgeSAM runs smoothly on smartphones, at more than 30 FPS on an iPhone 14, so even if your smartphone has no GPU you should have no problem running it. Checkpoints are available for download on Hugging Face. The approach followed in creating EdgeSAM was to distill the ViT-based SAM image encoder into a purely CNN-based architecture, and you can export it to Apple Core ML or ONNX models.
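The distillation recipe can be pictured as matching the compact student encoder's embeddings to the big teacher's on the same image. This is only the feature-matching part, heavily simplified; EdgeSAM's actual training also uses prompt-in-the-loop distillation with the mask decoder:

```python
def feature_distill_loss(teacher_feats, student_feats):
    """Mean-squared error between teacher and student feature vectors.

    Sketch of encoder distillation: the compact student (a CNN) is trained
    so its embeddings match the large teacher's (a ViT) on the same image.
    """
    assert len(teacher_feats) == len(student_feats)
    n = len(teacher_feats)
    return sum((t - s) ** 2 for t, s in zip(teacher_feats, student_feats)) / n

teacher = [0.8, -0.2, 0.5]
print(feature_distill_loss(teacher, [0.8, -0.2, 0.5]))  # 0.0, perfect match
print(feature_distill_loss(teacher, [0.0, 0.0, 0.0]))   # large: student untrained
```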

    On X, user daniel_nguyenx linked to a Reddit thread on the prices of input and output tokens of a supposedly leaked GPT-4.5. Rumors spread about GPT-4.5 having heavy multimodal capabilities across vision, video, audio, language and 3D. BGR provides a complete breakdown of the prices shared in the draft, and Sam Altman denied the rumors on X by simply posting “nah”.

    Moving on: LayoutGPT, for compositional visual planning and generation with large language models — that is, image layout generation based on text input.

    The paper was featured at NeurIPS 2023, which this year was basically a hiring frenzy: with the rhythm at which machine learning is developing, people went there as recruiters and engineers, not to wait for NeurIPS to learn about the latest developments in machine learning. Okay, back to the point. LayoutGPT guarantees a high degree of user controllability by using large language models as visual planners to output plausible layouts in multiple domains, from two-dimensional images to three-dimensional indoor scenes. Let’s see if I can find the paper: attaining a high degree of user controllability in visual generation often requires intricate, fine-grained inputs like layouts; however, such inputs impose a substantial burden on users when compared to simple text inputs. To address this issue, the authors study how large language models can serve as visual planners by generating layouts from text conditions, and thus collaborate with visual generative models. The researchers claim that, combined with a downstream image generation model, LayoutGPT outperforms text-to-image models and systems by 20-40%, and achieves performance comparable to human users in designing visual layouts, in terms of numerical and spatial correctness.

    Let’s check the project site. Here you can see two-dimensional visual generation, and here three-dimensional indoor scene synthesis using LayoutGPT. As for two-dimensional image layouts, LayoutGPT can apply the numerical reasoning skills of large language models to layout generation, and it learns spatial concepts through in-context demonstrations. It provides application scenarios playing to the natural advantages of using large language models for image layout generation: attribute binding, assigning the correct attributes to the bounding boxes, and text-based inpainting, imagining and expanding the under-specified description of certain objects. As for three-dimensional scene synthesis, LayoutGPT shows performance comparable to supervised methods in indoor scene generation conditioned on room type and floor plan size, and its autoregressive manner enables completing a partial scene.
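To make the "LLM as visual planner" idea concrete, here is a toy parser that turns layout text into bounding boxes and checks their numerical correctness against the canvas. The line format is invented for illustration; it is not LayoutGPT's actual output format (which is CSS-like):

```python
def parse_layout(text, canvas=64):
    """Parse lines like 'cat: x=4, y=10, w=20, h=18' into box dicts,
    flagging boxes that leave the canvas (a numerical-correctness check
    of the kind LayoutGPT is evaluated on).

    The line format here is hypothetical, for illustration only.
    """
    boxes = []
    for line in text.strip().splitlines():
        name, attrs = line.split(":")
        box = {"name": name.strip()}
        for part in attrs.split(","):
            k, v = part.split("=")
            box[k.strip()] = int(v)
        box["inside"] = (box["x"] >= 0 and box["y"] >= 0
                         and box["x"] + box["w"] <= canvas
                         and box["y"] + box["h"] <= canvas)
        boxes.append(box)
    return boxes

layout = """
cat: x=4, y=10, w=20, h=18
dog: x=30, y=40, w=40, h=30
"""
boxes = parse_layout(layout)
for b in boxes:
    print(b["name"], b["inside"])
```

The second box overflows the 64-unit canvas (30 + 40 > 64), the kind of spatial error a planner is scored against.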

    Okay, next: ByteDance released a paper about StemGen, a music generation model that listens to musical context and responds appropriately. The paper is posted to arXiv. End-to-end generation of music and audio using deep learning techniques has seen an explosion of activity recently, but most models concentrate on generating fully mixed music in response to abstract conditioning information; StemGen presents an alternative paradigm: producing music generation models that can listen and respond to musical context. StemGen’s architecture is similar to SoundStorm and VampNet, in that it is a non-autoregressive Transformer language model, and it has been trained with open-source and proprietary data. It was thoroughly evaluated with two metrics, Fréchet Audio Distance and a music information retrieval descriptor distance, that quantify competitive audio quality and strong musical alignment with the musical context. It would be greatly beneficial to indie game developers. You can listen to examples on Julian Parker’s website; I suppose he is one of the engineers who worked on it. You provide starting points, and StemGen generates music.
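The Fréchet Audio Distance mentioned above compares Gaussians fitted to embedding statistics of reference and generated audio. Collapsing it to one dimension keeps the idea visible; the real metric uses multivariate statistics of embeddings from a pretrained audio classifier:

```python
import math

def frechet_distance_1d(samples_a, samples_b):
    """Frechet distance between two 1-D Gaussians fitted to the samples.

    FAD applies the same formula to multivariate audio-embedding
    statistics; in one dimension it reduces to
        d^2 = (mu_a - mu_b)^2 + (sigma_a - sigma_b)^2
    """
    def stats(xs):
        mu = sum(xs) / len(xs)
        var = sum((x - mu) ** 2 for x in xs) / len(xs)
        return mu, math.sqrt(var)

    mu_a, sd_a = stats(samples_a)
    mu_b, sd_b = stats(samples_b)
    return (mu_a - mu_b) ** 2 + (sd_a - sd_b) ** 2

real = [0.0, 1.0, 2.0, 3.0]
same = [0.0, 1.0, 2.0, 3.0]
off = [10.0, 11.0, 12.0, 13.0]
print(frechet_distance_1d(real, same))  # 0.0: identical statistics
print(frechet_distance_1d(real, off))   # large: the distributions differ
```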

    Then I found out about llm360.ai, which introduced a framework for fully transparent, open-source large language models, for transparency, trust and collaborative research. Like other releases, LLM360 provides model weights and evaluation results, but additional information is often needed to genuinely understand a model’s behavior, and this information is not typically available to most researchers. So the organization committed to releasing all the intermediate checkpoints (up to 360) collected during training; all of the training data, with its mapping to the checkpoints; all collected metrics, like loss, gradient norm and evaluation results; and all source code for preprocessing data and model training. Logs, evaluation and analysis results collected during training are publicly disclosed as well, in correspondence with the training steps and sequence. So the basis of LLM360 is to create a framework that encourages openness and research collaboration on large language models.

    The first two models to be released under LLM360 are Amber and CrystalCoder, both under the Apache 2.0 license. Amber is a 7-billion-parameter English model trained on 1.2 trillion tokens, with 360 checkpoints, and CrystalCoder is a 7-billion-parameter code model trained on 1.4 trillion tokens, which is strong at code and text generation. LLM360 helps researchers get a deeper look into the construction process of large language models, and to conduct research analyzing model dynamics.

    Next, DeepMind released FunSearch, making new discoveries in the mathematical sciences using large language models. FunSearch made the first discoveries in open problems in mathematics using large language models, by searching for functions written in computer code: it discovered new solutions for the cap set problem, and more effective algorithms for the bin packing problem, improving data center efficiency. The FunSearch process is the following: the large language model is shown a selection of the best programs it has generated so far, and it is asked to generate an even better one; the programs proposed by the large language model are automatically executed and evaluated; promising programs are added back to the database for selection in subsequent cycles; and the user can at any point retrieve the highest-scoring program discovered so far.
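The loop just described is easy to sketch. In this toy version a random numeric mutation stands in for the LLM proposal step, so only the select / propose / evaluate / store skeleton is faithful to FunSearch — a real run samples whole programs from an LLM conditioned on the best programs found so far:

```python
import random

def funsearch_toy(evaluate, steps=200, seed=0):
    """Skeleton of FunSearch's evolutionary loop.

    'mutate a number' stands in for 'ask the LLM for a better program';
    everything else (evaluate, store, select the best) mirrors the loop
    described in the blog post.
    """
    rng = random.Random(seed)
    database = [(evaluate(0.0), 0.0)]          # (score, candidate) pairs
    for _ in range(steps):
        _, parent = max(database)              # select a top candidate
        child = parent + rng.gauss(0, 0.5)     # "LLM" proposes a variant
        database.append((evaluate(child), child))
        database = sorted(database, reverse=True)[:10]  # keep only the best
    return max(database)

# Maximize a simple objective with its optimum at x = 3.
score, best = funsearch_toy(lambda x: -(x - 3.0) ** 2)
print(round(best, 2))
```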

    And now this one: the Karlsruhe Institute of Technology developed SpateGAN and released the related study, on spatio-temporal downscaling of rain fields using a conditional GAN approach, in Earth and Space Science. The SpateGAN neural network guarantees higher resolution for more precise regional climate models; the goal is to prepare for disasters at an early stage. Strong precipitation can cause floods and landslides, but global climate models for forecasting are based on low-resolution grids, and precipitation is highly variable in space and time; the solution they rely on is downscaling. KIT’s generative AI model increases resolution in the spatial domain from 32 km to 2 km, and in the temporal domain from 1 hour to 10 minutes.

    So what is the approach followed to build SpateGAN? SpateGAN is basically a three-dimensional convolutional conditional generative adversarial network, in which both the generator and the discriminator are convolutional neural networks, and it draws on video super-resolution techniques. It has been trained on 10 years of radar observations for Germany. Compared to other models, SpateGAN has been trained without a predefined expert metric, and it generates an ensemble of high-resolution solutions, guaranteeing low computational effort compared to dynamical downscaling approaches. The result is temporally consistent, high-resolution rainfall fields: here you can see the low-resolution grid, and this is the SpateGAN reconstruction at higher resolution.

    Now let’s talk about an LF AI project: the RWKV language model, a recurrent neural network featured in its related study, “RWKV: Reinventing RNNs for the Transformer Era”, which you can find on arXiv. R is the receptance vector, which acts as the receiver of past information; W is the weight, signifying the positional weight decay, a trainable parameter within the model; K is the key vector, performing a role analogous to K in the traditional attention mechanism; and V is the value vector, functioning similarly to V in conventional attention processes.

    Here you can see the setting. The problem with Transformers is that they suffer from memory and computational complexity that scales quadratically with sequence length. In contrast, recurrent neural networks exhibit linear scaling in memory and computational requirements, but struggle to match Transformers’ performance due to limitations in parallelization and scalability. So the researchers proposed a novel model architecture, Receptance Weighted Key Value (RWKV), that combines the efficient parallelizable training of Transformers with the efficient inference of recurrent neural networks. Their approach leverages a linear attention mechanism and allows formulating the model as either a Transformer or a recurrent neural network, thus parallelizing computation during training and maintaining constant computational and memory complexity during inference. It is 100% attention-free, it is a model with GPT-level large language model performance, and it can be trained like a parallelizable GPT Transformer. It has been scaled from 169 million up to 14 billion parameters.

    On the website they even provide links to demos, at 1.5 billion and 3 billion parameters, on Hugging Face Spaces; there you have a Gradio demo, so you can tweak parameters to get different outputs. It offers great performance, fast inference and fast training; it saves VRAM; and it promises “infinite” context length and free text embeddings. On the wiki page they provide a comparison against existing Transformer models: 10x to 100x lower compute requirements; it scales to any context length linearly (that’s why they say they can provide infinite context length); it performs just as well in terms of quality and capability; and it is generally better trained in other languages. The cons are that it is sensitive to prompt formatting and weaker at tasks that require lookback, so you have to reorder your prompt accordingly.
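The recurrent formulation can be sketched per channel: a decaying weighted sum and its normalizer are carried forward, so each step costs O(1) memory regardless of sequence length. This is a simplification of the actual WKV recurrence — real RWKV uses a learned per-channel decay plus a separate bonus weight for the current token — but it shows the shape of the trick:

```python
import math

def wkv_scalar(ks, vs, w=0.9):
    """Simplified RWKV-style recurrence for one channel.

    State is a decaying weighted sum (num) and its normalizer (den):
        num_t = w * num_{t-1} + exp(k_t) * v_t
        den_t = w * den_{t-1} + exp(k_t)
    Output at each step is num_t / den_t: a softmax-like average over the
    past, computed in constant memory instead of quadratic attention.
    (Real RWKV uses a learned per-channel decay and a bonus weight u for
    the current token; this sketch keeps only the overall shape.)
    """
    num = den = 0.0
    outs = []
    for k, v in zip(ks, vs):
        num = w * num + math.exp(k) * v
        den = w * den + math.exp(k)
        outs.append(num / den)
    return outs

# A strongly keyed late token dominates the running average:
outs = wkv_scalar(ks=[0.0, 0.0, 3.0], vs=[1.0, 1.0, 5.0])
print([round(o, 2) for o in outs])
```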

    Here you have the arXiv paper, and on GitHub they provide the repos: RWKV-LM, the language model itself, to test it and fine-tune it, and ChatRWKV, which is like ChatGPT but powered by the RWKV language model.

    And then you have this paper, a mathematical perspective on Transformers, in which Transformers are interpreted as nonlinear mean-field interacting particle systems, in which clusters emerge over long time. The paper provides a simplified modeling of the attention mechanism and layer normalization; through a continuity equation, a Transformer turns into a flow map from the initial to the terminal distribution of particles. The appearance of clusters is illustrated by numerical experiments with pretrained models, and you have code and visualizations in the GitHub repo: here you can see the particles cluster to three points, and even to a single point. Under the natural language processing interpretation, particles are tokens, and the analysis corroborates the view of the attention mechanism as inter-token attraction, where the inner product between tokens is a measure of their semantic similarity.
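The "attention as inter-token attraction" reading can be played with directly: give each token a scalar value, weight its attraction to the others by the exponential of pairwise inner products, and step the system forward; the tokens drift into clusters, echoing the paper's long-time behavior. A toy sketch, not the paper's exact model:

```python
import math

def attract_step(tokens, beta=1.0, dt=0.1):
    """One explicit-Euler step of a mean-field attention-like dynamic.

    Each (scalar) token moves toward the softmax-weighted average of all
    tokens, with weights exp(beta * x_i * x_j): the inner-product
    'semantic similarity' acting as attraction strength. Repeated steps
    make similar tokens cluster together.
    """
    new = []
    for x in tokens:
        weights = [math.exp(beta * x * y) for y in tokens]
        z = sum(weights)
        target = sum(wt * y for wt, y in zip(weights, tokens)) / z
        new.append(x + dt * (target - x))
    return new

pts = [1.0, 1.2, -1.0, -1.1]
for _ in range(200):
    pts = attract_step(pts)
spread_pos = abs(pts[0] - pts[1])  # the two similar (positive) tokens cluster
print(round(spread_pos, 3))
```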

    Finally: Putin confronts his AI double. Reuters says there is recurrent speculation by Western media that Putin has one or more body doubles due to health problems; Reuters, and even the BBC, have reported on it. Russian president Vladimir Putin appeared briefly lost for words on Thursday, December 14, when confronted with an AI-generated version of himself, says Reuters. The double took the opportunity to put a question to Putin about artificial intelligence during an annual news conference where dozens of callers from around the country were hooked up to the president by video link. “Vladimir Vladimirovich, hello, I am a student at St Petersburg State University. I want to ask: is it true you have a lot of doubles?” the double asked, prompting laughter among the audience in the hall with Putin in Moscow. Vladimir Vladimirovich answered that only one person must be like him and speak with his voice, and that will be him, and said that Russia should become a world leader in AI, the BBC reported.

    Whew, that was hard. Okay guys, it’s all for today. I don’t know if you’re going to like this video; if you do, hit the subscribe button and subscribe to my channel. It’s all for today: see you next time, bye.

    1 Comment

    1. My system will end up needing about 10 queries per second, which even at Gemini prices is around $200/hour. So this stuff really needs to come down in price for less serious use cases like games. My only choice now is to grab any free queries I can from any promotions and just crank out transfer learning examples.
